PREFACE


This  publication  represents  the  proceedings  of  the  Military  Operations  Research 
Society  (MORS)  Simulation  Validation  Workshop  held  March  31  -  April  2,  1992,  in  Alexandria, 
Virginia.  This  workshop  was  one  of  a  continuing  SIMVAL  series  that  MORS  has  had  in  the 
area of Simulation Validation. It contains the reports (Chapters II-V) of the chairs of the four
working groups into which the workshop was organized. It also contains two other reports, one
by the overall SIMVAL series co-chairs (Chapter I), and one by Dr. Paul Davis (Chapter VI)
which  he  wrote  based  on  the  series  activities  and  his  other  efforts  in  verification,  validation  and 
accreditation  (VV&A). 

Chapter I, Overview, presents the overall SIMVAL approach, its history, and the basic
definitions, and describes an emerging picture of VV&A. Chapter II, The Basics, presents a look
at three major areas supporting VV&A: documentation, configuration management and
independent review. Chapter III, Verification, provides an overview of verification and the
major methods of verification. Chapter IV, Validation, is divided into four parts. Part I,
Validating Models and Simulations, describes the overall structure for validation, methods and
considerations in conducting a validation effort, and a validation documentation approach. Part
II, The Multidimensional Space of Validation, provides another way of viewing the overall area
of validation. Part III, Face Validation and Face Validity, and Part IV, Sensitivity Study of a
Simulation  Model,  describe  two  validation  methods,  face  validation  and  sensitivity  analysis,  and 
considerations  in  their  use.  Chapter  V,  Accreditation,  addresses  the  area  of  accreditation,  its 
intent,  considerations  in  application  and  philosophy  of  use. 

There  are  many  different  views  of  the  VV&A  area.  Chapter  VI,  A  Framework  for 
Verification,  Validation  and  Accreditation  is  one  of  these,  which  is  logical,  consistent  and  held 
by  many  in  the  VV&A  community.  It  is  provided  to  demonstrate  that  work  continues  in  the 
VV&A  area  by  many  dedicated  professionals  and  by  many  government  and  industry 
organizations. 

It  is  a  basic  premise  of  the  SIMVAL  series  that  its  findings  represent  a  consensus  of 
the  military  and  military  support  community.  Getting  a  consensus  on  an  accepted  VV&A 
structure  will  take  continued  cooperation  and  support  of  all  those  several  hundred  people  who 
have  contributed  in  the  past  three  years  as  well  as  the  many  others  who  will  participate  in  the 
future.  This  publication  is  a  status  report  of  the  SIMVAL  series.  Future  activities  will  bring 
further  definition  and  understanding  to  the  overall  structure  and  methodologies.  This  status 
report  will  evolve  to  represent  those  new  findings. 


James  J.  Sikora 
SIMVAL  Co-Chair 
CHAPTER I - OVERVIEW


by  Marion  L.  Williams  and  James  J.  Sikora 


1.0  INTRODUCTION 

Simulation has been an important operations research tool for many years. War gaming has
helped develop strategy; campaign models have provided assessments of system utility;
engineering simulations have assisted in design of systems and techniques. However, the
complexity has changed from simple algorithms to hundreds of thousands of lines of computer
code; the number of simulations has grown from hundreds to thousands; and the emphasis has
changed from providing insight to that of providing input to decisions on major systems. With
these changes, decision makers are asking for some assurance that the models faithfully
represent those aspects of the real world that are important to the problem at hand. We have
been slower to provide this assurance than we have been to develop new models.

“Validation” is not an easy term to define. It means one thing for the radar range equation
or the equations of motion of a satellite. It is quite another thing for interactions in a
battlefield. In another dimension, it is less important in some applications than others. For
some applications, validation is of little importance; for others, it isn’t possible. However,
as the results of simulation are presented to support DoD studies, the question frequently
asked is: “Has your model been validated?” In the case of operational testing, the question
is “Has your model been validated with field test data?” Currently, OSD guidelines on
operational testing require that models used to support evaluations be “accredited.”


1.1  BACKGROUND 

To address these issues, MORS sponsored a series of activities on “Simulation Validation.”
The activities that support the results for this monograph are shown in Figure 1-1. The first
activity, a mini-symposium held in Albuquerque, New Mexico, October 15-18, 1990, was hosted by
the Air Force Operational Test and Evaluation Center and BDM International, Inc. The
mini-symposium provided a forum for general discussion of the broad topic of simulation
verification, validation, and accreditation and served as a basis for planning future efforts.

Objectives  of  the  mini-symposium 
were  to: 

•  Review current efforts in simulation validation;

•  Support technical interchange on simulation validation;

•  Develop consensus on a consistent set of definitions for terms such as “verification,”
   “validation,” “accreditation,” etc.;

•  Develop a plan for future efforts to address issues of simulation validation.

The  mini-symposium  was  divided 
into  five  major  sessions:  Requirements 
Analysis;  System  Design;  Operational  Test 
and  Evaluation;  Operations  Support  and 
Tactics  Development;  and  Training.  Papers 
for  these  sessions  included  case  histories, 
methodologies,  lessons  learned,  and  status  of 
current  simulation  validation  efforts. 
Figure 1-1. Simulation Validation Series Activities


A  Senior  Advisory  Group  (SAG), 
composed  of  senior  analysts  representing  a 
breadth  of  simulation  experience,  was 
formed  to  provide  guidance  in  planning  the 
workshop  series,  to  assist  in  developing  a 
consistent  set  of  definitions,  and  to  develop 
a  roadmap  of  activities  necessary  to  arrive  at 
a  consensus  on  a  model  validation  process. 
The  SAG  membership  is  shown  in  Figure 
1-2.  The  goal  of  the  SAG  was  to  arrive  at 
a  consistent  set  of  definitions  for  simulation 
verification,  validation  and  accreditation 
which would be agreeable to all DoD Components, thus resolving the problems caused
by  the  current  use  of  different  definitions. 

The SAG recommended a subsequent meeting to provide a better description of the validation
methodologies. To accomplish this, an ad hoc working group meeting was held at The MITRE
Corporation on December 12-13, 1990, with DoD Component and industry representatives. The
purpose of the meeting was to attempt to define elements of a validation process. Experts in
five different types of application areas were invited: force planning and operations;
acquisition; test and evaluation; training; and deployment, mobilization, and sustainability.

The latest session of the SIMVAL Workshop series was held March 31 - April 2, 1992. At this
workshop, model verification, validation, and accreditation (VV&A) case studies were
discussed, and examples were mapped into the VV&A elements defined at previous meetings. The
concept was to use the most pertinent portions of the case studies as examples of specific
elements of VV&A.

Figure 1-2. Senior Advisory Group Members

1.2 VV&A IN THE SCHEME OF PROBLEM SOLUTION

The overall process in which VV&A plays a role is shown in Figure 1-3. The process begins
with a problem or set of issues that need to be addressed. Using the scientific method, the
problem is decomposed into elements which lend themselves to investigation or analysis. Each
of these problem elements can be addressed using different approaches, some of which may
include modeling and simulation. For those that are supported by modeling and simulation, a
set of requirements for the model to correctly and satisfactorily address the element should
be developed. These requirements for all elements are grouped together and become the
application requirements. Candidate models' capabilities are then compared against the
application requirements in order to select the most appropriate model(s) for the
application. The model selection considers not only the model that best satisfies the
requirements, but also the credibility of that model for that specific application. This
model selection is supported in terms of model capability and credibility by verification and
validation. The results of the model selection process are then used to support an
accreditation decision. Only after this consideration and a formal accreditation decision
have been made should the model be applied in a simulation, or should model results be used
to support an acquisition decision at any level.

1.3  DEFINITIONS 

Basic VV&A definitions were developed during initial SIMVAL meetings, and then used
throughout the workshop series. There was nothing dramatically new in the definitions; they
were modifications of those currently being used by some organizations. However, they were
fully discussed and honed until a consensus was reached. Other definitions could have been
chosen which are adequate. However, the goal of the SIMVAL series was to agree on a common
set of definitions so that we could more clearly and easily communicate.

The  following  set  of  definitions  was 
developed  by  the  SAG  and  agreed  upon  by 
the  SIMVAL  participants: 


VERIFICATION: The process of determining that a model implementation
accurately represents the developer's conceptual description and
specifications.


Verification  consists  of  two  basic 
types.  Logic  verification  ensures  that  the 
basic equations, algorithms, etc., are correct. Code/object verification ensures that
these  representations  have  been  correctly 
implemented  in  the  computer  code. 
Figure 1-3. VV&A in the Problem Solution Process


VALIDATION: The process of determining the degree to which a model
is  an  accurate  representation  of  the 
real  world  from  the  perspective  of 
the  intended  uses  of  the  model. 


The primary change in this definition is to recognize that validation is not an event, but a
process consisting of several steps to measure the degree to which the model represents the
real world. These steps may consist of face validation, data base validation, etc. Complete
validation, i.e., ensuring that the model represents the real world in all aspects, can be
achieved only for simple models, since complete validation implies that the model can be used
for any application. A complex digital simulation can achieve degrees of validation, but
complete validation is a goal that can probably never be reached. Therefore, it would be
improper to refer to such a model as “validated.”


ACCREDITATION: An official determination that a model is acceptable
for  a  specific  purpose. 


Accreditation is a decision that is based on a number of different factors, including V&V. It
accepts that a given level of V&V is sufficient for a model to be used in a particular
application. For some applications, a low level of V&V (for example, code verification) may
be acceptable. For other applications, a more rigorous validation may be necessary.
Accreditation must take into account the importance of the decision in determining the
rigorousness of V&V required, as well as other factors.

Figure 1-4. The Relationships of V&V to the Model Form

1.4 THE VERIFICATION AND VALIDATION STRUCTURE

Verification and validation are comparison processes. Verification compares the
implementation of a model against the intent or design of the model. Validation compares the
model against the real world. Both are processes that establish the credibility of the model
in performing certain functions. Accreditation, on the other hand, uses the credibility of
the model in a formal decision as to whether the model can be used for a specific
application. The relationships of verification and validation are shown in Figure 1-4.

The “Real World” at the top of the figure denotes the actual function or system which is
being modeled. If part or all of the function or system does not exist (e.g., a future
aircraft), then it is our best realization or understanding of that non-existent function or
system. From the Real World, the model designer selects the functions and systems that are
important to the class of problems that are the intended application set for the model. This
then is the basis for the Functional Requirements.

These requirements reflect the types of functions or systems to be modeled (e.g., ECM,
command and control, M1A2) as well as an indication of the level of detail desired (e.g.,
signal level, message level, operator level). The model requirements become the first
category of model structure which falls under the need for Documentation and Configuration
Management. For a further explanation and description of documentation, refer to Chapter 2.
Model requirements should be documented when developing the Model Concept.

The Model Concept is an initial model architecture which gives a consistent and sufficient
relationship description between the functions and systems to be modeled. The model concept
should satisfy all the functional requirements the modeler defined earlier. The model concept
is then documented (and thereby falls under configuration management) and becomes the basis
for the Model Design.

The Model Design is the structural outline of the model. It defines the modeled system
elements, their functioning, and their interrelationships. This design can be done in
successive levels of detail until sufficient definition is available to translate into the
Model Code.

The Model Code is the model in its computer language form. The Model Code finally is compiled
or assembled into its computer instruction form, the Model Object. This, along with the data
required to execute the model and to simulate the desired situation, is the Computer Model.
The Computer Model (or some functional elements of the Computer Model) may be implemented in
hardware or performed by humans as part of the simulation.

The verification processes then are the checks made at one stage of model development against
the requirements, design, and/or form of earlier stages to assure correct translation. For
example, code verification is the process of comparing the model code stage against the
requirements and specifications of the model design to ensure the code correctly represents
the design. For further explanation and description of verification processes, refer to
Chapter 3.

The validation processes compare any stage of model form against the real world. For example,
comparing the model design against the functionality of the real world system can help ensure
that the design represents all the necessary functions of the real world to satisfy the uses
of the model. Another validation method would be to compare the output of the computer model
against the functional performance of the real world system under the same initial
conditions. For further explanation and description of validation processes, refer to
Chapter 4.
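
The output comparison described above can be pictured with a minimal Python sketch. The function name, the relative-tolerance criterion, and the sample detection ranges are illustrative assumptions and are not part of the SIMVAL findings. In keeping with the definition of validation as a matter of degree, the sketch reports the fraction of test conditions on which the model agrees with the observation rather than a "validated" label.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One model output paired with the observation taken under the same initial conditions."""
    condition: str
    model_value: float
    observed_value: float


def fraction_within_tolerance(cases, rel_tol):
    """Return the fraction of cases whose model output lies within rel_tol of the observation."""
    if not cases:
        raise ValueError("at least one comparison case is required")
    agree = 0
    for case in cases:
        # Relative error is measured against the real-world observation.
        scale = abs(case.observed_value)
        error = abs(case.model_value - case.observed_value)
        if scale == 0.0:
            within = error == 0.0
        else:
            within = error / scale <= rel_tol
        if within:
            agree += 1
    return agree / len(cases)


# Hypothetical detection ranges (km) for three test conditions.
cases = [
    Comparison("clear day", 42.0, 40.0),
    Comparison("rain", 30.0, 25.0),
    Comparison("night", 35.0, 36.0),
]
degree = fraction_within_tolerance(cases, rel_tol=0.10)
print(f"model agrees with field data on {degree:.0%} of conditions")
```

The tolerance itself would come from the intended use of the model; a looser or tighter value changes the reported degree of agreement, which is why the result is only meaningful for a stated application.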

The credibility that a model gains by applying verification and validation processes is part
of the input to the accreditation decision. This was shown earlier in Figure 1-3.
Accreditation is addressed in more detail in Chapter 5.
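
As a rough illustration of the structure in this section, the following Python sketch lists the model forms in development order and enumerates which comparisons count as verification (a stage against earlier stages) and which as validation (any stage against the real world). The enumeration names and helper functions are assumptions made for the example; the ordering follows the text's description of Figure 1-4.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Model forms in development order, as described in the text."""
    FUNCTIONAL_REQUIREMENTS = 1
    MODEL_CONCEPT = 2
    MODEL_DESIGN = 3
    MODEL_CODE = 4
    MODEL_OBJECT = 5
    COMPUTER_MODEL = 6


REAL_WORLD = "REAL_WORLD"


def verification_checks(stage):
    """Verification compares a stage against the earlier stages it was translated from."""
    return [(stage.name, earlier.name) for earlier in Stage if earlier < stage]


def validation_check(stage):
    """Validation compares any stage of model form directly against the real world."""
    return (stage.name, REAL_WORLD)


# Code verification: the model code is checked against the design and earlier forms.
for later, earlier in verification_checks(Stage.MODEL_CODE):
    print(f"verify {later} against {earlier}")
print("validate %s against %s" % validation_check(Stage.MODEL_DESIGN))
```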
CHAPTER II - THE BASICS

by  Joseph  J.  Cynamon 


2.0  INTRODUCTION 

The  basics  of  model  or  simulation 
verification,  validation,  and  accreditation 
(VV&A)  are  the  methods  and  tools  used  to 
track, record, and control the model development and VV&A processes. This chapter
discusses  some  of  these  methods  and  tools: 
documentation,  configuration  management, 
and  independent  review  of  the  model. 

The message delivered by this chapter is that VV&A is composed of a series of tasks that
contribute to its accomplishment. If any of these tasks or elements is not successfully
accomplished and completed, then the VV&A enterprise can also be expected to fall short of
its goals. VV&A can be thought of as a sum of many elements. If any of the links in this
chain of elements is violated, VV&A will be endangered. This chapter covers in detail the
most important elements of these basics. Figure II-1 is a summary flow chart used to identify
some of the basic elements crucial to VV&A. It also highlights the feature that each element
contributes.

Figure II-1. The Basics Chart

•  Documentation provides the description of the model or simulation, its requirements, how
it operates, and its characteristics, algorithms, and intended application(s). Documentation
should describe the history of the development of the model and the methods used for testing
its functionality and properties.

•  Configuration Management (CM) provides for the tracking (i.e., an audit trail) of the
development of the model or simulation. Each of these functions provides an information
source for VV&A.

•  Finally, Independent Review is the process in which an impartial expert reviewer(s)
conducts a critical evaluation of both the product and the VV&A process performed on it. This
review should be done without bias and reservation, and must be conducted independently of
the influence of the product developer(s). The independent reviewer(s) should have full
access to all documentation and the cooperation of the developer(s) and VV&A participants.
This means availability of all levels of documentation, total availability of the
configuration resources, and complete cooperation of the participants for consultation with
the reviewer(s).

2.1  DOCUMENTATION 

The  definition  of  documentation 
developed  by  the  SIMVAL  working  group 
follows:


DOCUMENTATION: Analyst's manual, user's guide, programmer's manual, etc.,
providing the math, program structure, assumptions and algorithms used,
including documentation of procedures and results of any verification and
validation efforts.

The  function  of  documentation,  under 
VV&A,  is  to  provide  information  about  all 
aspects  of  a  model’s  intended  application(s), 
description, and history to the professional
community. 

The flow chart in Figure II-2 identifies the critical steps in the documentation process. In
this diagram two feedback loops are identified. The first loop illustrates the process of
establishing documentation requirements based upon the development of the evolving VV&A
requirements. The second loop is also a feedback process based upon the program’s progress
and the implementation of VV&A. Its purpose is to maintain responsiveness to the review
processes as additional documentation needs develop.

Given the above definition of documentation, Table II-1 details some of its
critical  elements  and  identifies  the  product 
or  process  needed  to  implement  them. 

2.1.1 Documentation Application Techniques

Factors  that  have  the  largest  impact 
on  VV&A  documentation  support  include 
the  following: 

•  Traceability  to  model  requirements. 

•  Description  of  the  functional  design 
of  the  model. 

•  The  identification  of  the  models  and 
algorithms  used  in  the  specific 
application. 

•  Data  requirements  identified  for 
VV&A. 

•  Confirmation that the required aspects of VV&A are fully covered.

•  A  record  of  the  history  of  the 
product’s  design,  development, 
testing  and  VV&A. 

Of primary importance is traceability. Historical records should be maintained to trace
requirements with product development (e.g., to link test procedures to requirements to
demonstrate they are met). Documentation provides the means to record and review the goals
and objectives of model development, the traceability of test results with performance goals,
and the fulfillment of functional requirements.

Figure II-2. Critical Steps in the Process of Documentation

A fundamental requirement is that an adequate detailed description of the model be given in
its documentation. Is there enough of a description to meet the requirements disclosure? Are
changes made to algorithms and math models recorded completely and in a timely manner? Have
coding revisions and changes been reflected with proper comments and logged in the source
code and accompanying documents? Are revisions formally noted and controlled in the
documentation so that users are properly alerted? Have the developers published their
findings of verification testing of key parameters so that they can be traced to the proper
performance requirements?


Table II-2 provides a list of minimum requirements for verifying traceability of VV&A
documentation. This list can be used to check for documentation items that can be regarded as
a minimum functional package to support VV&A.
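
One of the traceability items, a matrix that correlates test results and acceptance ranges with the requirements, can be sketched in a few lines of Python. The record layouts, identifiers, and numeric values here are hypothetical and serve only to show the linkage between requirements and tests; they are not taken from the workshop material.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    req_id: str
    parameter: str
    low: float   # lower bound of the acceptance range
    high: float  # upper bound of the acceptance range


@dataclass
class ResultRecord:
    test_id: str
    req_id: str      # the requirement this test traces to
    measured: float


def traceability_matrix(requirements, results):
    """Return one row per requirement: (req_id, linked test ids, status)."""
    rows = []
    for req in requirements:
        linked = [r for r in results if r.req_id == req.req_id]
        if not linked:
            status = "untested"
        elif all(req.low <= r.measured <= req.high for r in linked):
            status = "met"
        else:
            status = "not met"
        rows.append((req.req_id, [r.test_id for r in linked], status))
    return rows


# Hypothetical requirements and test results.
requirements = [
    Requirement("R1", "detection_range_km", 35.0, 45.0),
    Requirement("R2", "track_error_m", 0.0, 2.0),
    Requirement("R3", "update_rate_hz", 9.0, 11.0),
]
results = [
    ResultRecord("T1", "R1", 40.0),
    ResultRecord("T2", "R2", 2.5),
]
matrix = traceability_matrix(requirements, results)
for row in matrix:
    print(row)
```

A requirement with no linked test shows up as "untested", which is the kind of gap a reviewer would look for when checking traceability.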

Table II-1. Documentation Needs

ITEM                          DESCRIPTION

Technical description         Manuals, text books, diagrams and charts, pictures and
                              illustrations, and photographs.

Description of uses           Demonstrations, training courses, user and applications
                              examples, on-line user manuals.

Installation procedures       Installation procedures, automated installation processes,
                              vendor-supplied support.

Installation validation       Examples to exercise critical functions and I/O supplied data
                              functions for automated installation comparisons.

Model design requirements     Design goals and specifications that are criteria for VV&A
                              evaluation and acceptance.

Algorithm and math model      Manual containing the algorithm reference sources,
                              assumptions and known limitations, and any supporting
                              performance analysis.

Computer program design       Description in words, diagrams, charts, tables, and pictures
                              both in hardcopy and in electronic storage.

Testing of the model          The purpose and description of the tests, the key parameters
                              to test, the analysis for setting the acceptance of
                              performance, the ranges of the acceptance of the results and
                              the testing procedure description.

Training of operators         Instructor-led course materials, computer programmed
                              instructive course, manuals and charts.

Dictionary of terms           Definitions of terms and parameters, tables of
                              interrelationships of variables, arrays, and computer
                              subroutines.

Table II-2. Recommended Minimum Requirements For Traceability

•  A checklist of design goals and objectives for the model.

•  A dictionary of key parameters.

•  A matrix that identifies quantitative performance values and their acceptance ranges
   correlated with the requirements.

•  A detailed description of all algorithms and math models used.

•  A list of all tests performed and their results in a matrix that correlates these results
   with the requirements.

•  A functional description of the model that can be correlated with the requirements and
   specifications.

•  An historical listing of design and development modifications made to the source code,
   and the rationale for the changes.

Complete and understandable documentation of the model's functional operation is essential.
Designing documentation that describes functionality takes creativity and initiative, keeping
in mind that information must be sent to a user or to the application community in concise
and understandable language. It is recommended that liberal use be made of illustrations,
diagrams, and charts. It is also recommended that functional block diagrams organized into
multiple levels for systematic detailing of the model be used. At the top level, the diagram
can begin with a general architectural description of the model. This can be followed by a
detailed tree diagram describing the sequence of executable routines. A functional block
diagram can then follow, describing the sequence of operations performed through the various
modes and processes during execution. Other useful diagrams are looping charts, which
identify the logical functions controlling the branching of sequence of the model processes
and conditions for switching. Looping diagrams can show parameter and variable processing
through the subroutines of the model. They identify the interface details and many of the key
transfer characteristics processes. In addition, they can be made to provide many other key
computational characteristics and functional routines, such as array size, word length,
processing speeds, and processing conditions. Signal flow diagrams also may be used to
describe operations quickly. They can be used to quickly point up pertinent model
characteristics to the reviewer(s). The process can easily be supported by text descriptions
that provide additional details about each element in the diagram. Table II-3 summarizes the
aforementioned diagrams, describing their functions and some key characteristics they
provide.

Spreadsheets provide another useful format for organizing model descriptions. Subroutine
calls can be itemized with a brief functional description. These types of charts could be
used to identify each routine, the routine that calls it, and the routines that it calls.
They could list the arguments passed to and from the routine, and identify the parameters
passed to and from it through common blocks and structure statements. The advantage of this
format is that various sorting techniques can be automated to recover the specific
interrelationships of structure and characteristics.
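
The spreadsheet idea can be illustrated with a small Python sketch in which each row describes one routine and simple queries recover the calling relationships. The routine names, purposes, and column names are hypothetical and chosen only for the example.

```python
# Each row plays the role of one spreadsheet line describing a routine.
# All routine names and descriptions below are hypothetical.
routines = [
    {"routine": "MAIN", "purpose": "Top-level driver", "calls": ["INIT", "RADAR"]},
    {"routine": "INIT", "purpose": "Read inputs and open output files", "calls": []},
    {"routine": "RADAR", "purpose": "Radar detection sequence", "calls": ["PROP", "DETECT"]},
    {"routine": "PROP", "purpose": "Propagation loss", "calls": []},
    {"routine": "DETECT", "purpose": "Detection decision", "calls": ["PROP"]},
]


def callers_of(table, target):
    """Return the names of routines that call the target routine, in alphabetical order."""
    return sorted(row["routine"] for row in table if target in row["calls"])


def by_fan_out(table):
    """Return routine names sorted by number of calls made (most first), then by name."""
    ordered = sorted(table, key=lambda row: (-len(row["calls"]), row["routine"]))
    return [row["routine"] for row in ordered]


print(callers_of(routines, "PROP"))
print(by_fan_out(routines))
```

Sorting and filtering the same table in different ways is what lets a reviewer recover a specific interrelationship without rereading the source code.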

Additional computer-automated tools also can be used to analyze and produce documents that
describe extended and complex models. Some of these tools produce tables that catalog
routines and their interfaces, their parameters, their arrays, and their variables associated
with the appropriate subroutines. They can analyze the program’s operation and timing,
searching for programming errors and warning of possible conflicts. They also analyze the
program's statistics and processing efficiency, making recommendations for improving the
running performance. The information collected can be stored in separate files for later
review, in support of VV&A, and can be used to sort through the tables for specific
identification of the program's interrelationships.
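
The chapter does not name particular tools. As a stand-in, the following Python sketch uses the standard library `ast` module to build a small catalog of routines, their arguments, and the routines they call from a piece of source text. The sample source and the catalog layout are assumptions made for illustration, and the sketch covers only the cataloging function, not timing or efficiency analysis.

```python
import ast


def catalog_routines(source):
    """Map each function defined in source to its argument names and the names it calls."""
    tree = ast.parse(source)
    catalog = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Collect plain-name calls made anywhere inside this function.
            calls = sorted({
                inner.func.id
                for inner in ast.walk(node)
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
            })
            catalog[node.name] = {
                "args": [a.arg for a in node.args.args],
                "calls": calls,
            }
    return catalog


# Hypothetical model source; it is parsed, never executed.
SAMPLE = '''
def propagate(range_km, loss_db):
    return attenuate(range_km) - loss_db

def attenuate(range_km):
    return 20.0 * log10(range_km)
'''

catalog = catalog_routines(SAMPLE)
for name, entry in catalog.items():
    print(name, entry["args"], entry["calls"])
```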
Table II-3. Summary Table of Block Diagrams

| ITEM | FUNCTION | CHARACTERISTICS |
|---|---|---|
| 1. Overall architecture | Provide a list of processes used, the order of their use, and the conditions for their selection. | Show: preprocessor for conditioning input data, command file for executing the primary program, post processor for interactively conditioning the output file, graphic program for producing user information and observations, and interactive integrator for allowing users to interface with program's execution. |
| 2. Tree | Identify subroutines used in all of the program phases, showing sequencing and conditions for selection. | Program phases are: initialization, input reads, opening of the output data files, ordering the processes and rules of performance, the operational phase, the termination and file closing phase. |
| 3. Functional block | Describe operation sequence, modes, and processes performed. | Radar simulation would have a sequence of operations that include the waveform generation, the transmitter power modulator, the antenna scan and coverage processing, the receiver, the IF mixer, the signal processing, etc. The system modes might include the search phase, an acquisition phase, and a tracking phase. |
| 4. Looping | Highlights logical functions and branching conditions and sequencing based upon inputs. | These are specialized diagrams that might provide useful information to correlate with the system requirements. |
| 5. Signal flow | Provide observers with direct contact with the code operations that include insights to functions, logic, and branching of signals being processed. | This diagram might provide the operation of routines, identifying their functions, the parameters they are processing and their parameter characteristics. |



Table II-4. Suggested Format for Historical Records of Model Testing

| ITEM | DESCRIPTION |
|---|---|
| Purpose of tests | Identify the test requirements from the design specification. |
| Description of test | Describe the method(s) to be used in performing the test, including: the dynamics, observation intervals, range of values to be tested, the parameter constraints of the input to be applied, and how they were controlled. |
| Record the results | Record results as they occur, over time and space, recording all conditions and constraints observed during testing. |
| Post the Analysis | Provide all analysis computations performed, including the algorithms and the math processes used. Present results in the format as specified by the documented requirements. |

Comprehensive  historical  documenta¬ 
tion  for  all  tests  performed  during  the 
product’s  development,  including  a  descrip¬ 
tion  of  the  purpose  of  the  tests,  the  test 
results, and the analysis of these results, is
essential.  Historical  documentation  will  tie 
testing  to  the  specific  design  requirements  of 
the  product.  All  of  this  should  be  incorpo¬ 
rated  into  the  design  and  development  log¬ 
ging  document  that  will  be  one  of  the  bases 
for  verification  and  independent  review. 
The  document  must  be  archived  for  future 
review  by  developers  and  users  as  a  basis 
for  additional  new  model  applications. 

Table  II-4  suggests  a  format  for  the 
historical  recording  of  the  development 
testing  requirements.  It  is  not  complete  but 
does  identify  the  functions  and  descriptions 
of  some  major  elements. 
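
One way to make the Table II-4 format concrete is a small record structure. The sketch below is an assumption about how such a record might be held in code; the class name `ModelTestRecord`, its field names, and the completeness rule are hypothetical and are not prescribed by the workshop.

```python
from dataclasses import dataclass, field


@dataclass
class ModelTestRecord:
    """One historical test record, following the four Table II-4 items."""
    purpose: str        # test requirement taken from the design specification
    description: str    # method, ranges tested, input constraints, controls
    results: list = field(default_factory=list)  # (time, observation) pairs
    analysis: str = ""  # computations, algorithms, and math processes used

    def record(self, time_s, observation):
        """Append a result as it occurs, keeping the time of observation."""
        self.results.append((time_s, observation))

    def is_complete(self):
        """True only when all four items have been filled in."""
        return bool(self.purpose and self.description
                    and self.results and self.analysis)


if __name__ == "__main__":
    rec = ModelTestRecord(
        purpose="Hypothetical requirement: range output within tolerance",
        description="Sweep input power over a fixed set of values",
    )
    rec.record(0.0, "run started")
    rec.analysis = "Compared output against hand calculation"
    print(rec.is_complete())
```

A record that fails `is_complete()` flags a gap in the historical documentation before it is archived.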

The documentation should describe
the math models and algorithms used. In
describing these, connections should be
made to the coding process used, to the
assumptions and conditions for them to be
appropriate, and to the operational modes
for which they are applied. For instance, to
calculate a state vector of any object, a
numerical integration process is required.
The implementation of that process should
be discussed and its selection justified. If
coordinate frame changes are needed, a
discussion of the choice of frames should be
given with the derivations of their equations.
In addition, the use of classical performance
relationships should be specified. For example,
in the use of the radar range equation
for predicting radar performance, the
parameters and their appropriateness to that
application should be justified. It is recommended
that an analysis manual be organized
around the use of functional flow diagrams.
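
To illustrate what documenting a classical performance relationship might look like, the sketch below codes the textbook monostatic radar range equation and states its assumptions (free space, no system losses, a single antenna for transmit and receive) where a reviewer can see them. The function name and parameter names are hypothetical; a real model would document whichever form and loss terms it actually uses.

```python
import math


def max_detection_range(p_t, gain, wavelength, rcs, s_min):
    """Maximum range (m) from the classical radar range equation.

    R_max = [P_t * G^2 * lambda^2 * sigma / ((4*pi)^3 * S_min)] ** (1/4)

    Assumptions: free-space propagation, no system losses, and the same
    antenna gain on transmit and receive.
      p_t        transmitted power (W)
      gain       antenna gain (dimensionless ratio, not dB)
      wavelength carrier wavelength (m)
      rcs        target radar cross section (m^2)
      s_min      minimum detectable received power (W)
    """
    values = {"p_t": p_t, "gain": gain, "wavelength": wavelength,
              "rcs": rcs, "s_min": s_min}
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    numerator = p_t * gain ** 2 * wavelength ** 2 * rcs
    denominator = (4.0 * math.pi) ** 3 * s_min
    return (numerator / denominator) ** 0.25


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any real system.
    print(max_detection_range(1.0e6, 1000.0, 0.03, 1.0, 1.0e-13))
```

Because range varies with the fourth root of transmitted power, a sixteen-fold increase in power only doubles the range; stating such consequences alongside the parameters helps a reviewer judge their appropriateness.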

Table  II-5  highlights  the  types  of 
descriptive  information  that  a  reviewer  needs 
to  assess  design  adequacy. 

Table II-5. Checklist of Model Description Entries

•  Develop a systematic list of model operations performed and identify
the algorithms and math models used.

•  Each of the algorithms and math models should have identified
references justifying their use, or derivations from known and sound
physical principles. All assumptions and limitations must be identified
along with any analysis of applicability.

•  To support algorithm analysis, diagrams identifying time and space
parameters and their functions must be included.

•  The coding implementation of these algorithms should also be described.
An analysis investigating the computational accuracy supporting
the coding algorithm should also be given.

•  Conditional descriptions for the computation must also be included.
This includes coordinate computation frames, time lines, conditional
events and modes, and sequences that are conditioned on events.

It is also recommended that a set of
documents be created specifically to support
VV&A. This set of documents includes the
verification and validation plan, verification
and validation report(s), and any accreditation
decision and decision support documents
for previous applications of the model
or simulation. Further details of the contents
of these documents are discussed in
Chapters 4 and 5.

2.1.2  Documentation  Strengths 

VV&A may be improved by imposing
and enforcing sound documentation
principles. Some of the advantages gained
by maintaining documentation are identified
as follows:

•  Having documented records of the
products developed and tested ensures
traceability to requirements.

•  Providing  resources  for  storing  and 
maintaining  library  facilities  in  a 
designated  archive  with  defined 
control procedures assures information
security.

•  Setting  minimal  standards  for  de¬ 
scribing  and  updating  model  applica¬ 
tions  assures  that  historical  tracking 
and extended life potential will accompany
model reuse.

•  Establishing  a  facility  to  store  docu¬ 
mented  records  of  VV&A  events  and 
conclusions  provides  future  applica¬ 
tions  with  traceability  of  VV&A. 

The  location  for  storing  documenta¬ 
tion  should  be  central  and  controlled  to 
assure  protection  against  loss  and  tampering, 
and  accessible  to  all  model  reviewers  and 
users.  Once  a  document  is  approved  for 
archiving,  it  becomes  protected  and  only  the 
designated  control  group  is  authorized  to 
make  and  promulgate  revisions. 

Table  II-6  summarizes  the  essential 
elements  of  archiving  in  terms  of  the 
reporter’s  who,  what,  when,  where,  and 
how. 

Table II-6. Essential Elements of Systematic Archiving

| QUESTION | ELEMENTS |
|---|---|
| WHO? | a. The model procurer |
| | b. The developer |
| | c. The user |
| | d. The independent reviewer |
| | e. The accreditation team |
| WHAT? | a. Requirements, plans, budgets, identifications, schedules, and applications. |
| | b. Description, operation, design, test procedure, test data, and reports/reviews. |
| | c. Applications, user form reports, improvements, and operational limitations. |
| | d. Information used in reviews, reviewer notes, conclusions, and recommendations. |
| WHEN? | a. Start at the program outset through all phases of program development and applications. |
| | b. Design phase, integration, testing, and engineering development. |
| | c. User group meetings, application experience during testing, and user improvements and extensions. |
| | d. During planning and scheduling, data collecting, and reporting. |
| WHERE? | All elements become archived by the configuration management team in their facilities. |
| HOW? | All elements should be stored in both electronic and hardcopy forms. |

2.1.3  Documentation Limitations

The limitations that can have an
impact on documentation include:

•  Model or simulation development
programs do not always provide
sufficient funding, schedule, and
manpower for documentation.

•  The caliber of documentation often
depends upon the caliber of the
program requirements.

•  Poor documentation is almost always
accompanied by poor configuration
management, and vice versa.

•  When program cutbacks occur, documentation
and configuration management
are usually the first to be reduced.


Inadequate  documentation  makes 
VV&A  more  difficult. 
Validation and accreditation rely
heavily on the quality of the documentation
of the model. VV&A must be supported
with real project dollars, near-term and
long-term scheduling to integrate the documentation
with each phase of the program,
and dedicated resources and staff.

One  continuing  difficulty  in  setting 
up  documentation  requirements  for  hardware 
procurement  programs  is  that  many  engi¬ 
neers  do  not  take  the  time  to  document  their 
work.  However,  it  is  expected  that  new 
emphasis  will  be  placed  upon  making  time 
for  documenting  one’s  work,  and  the  techni¬ 
cal  staffs  supporting  the  model  will  require 
good  documentation  skills. 

2.1.4  Documentation  Lessons  Learned 
From  looking  at  projects  that  did 
receive  complete  documentation  as  well  as 
those  that  did  not,  the  following  are  some  of 
the  lessons  learned  at  the  SIMVAL  work¬ 
shop  series: 

•  If the model is not documented with
comprehensive  top  level  overview 
descriptions,  the  review  process  is 
more  difficult. 

•  The  lack  of  historical  records  of  the 
product’s  development  defeats  the 
traceability  required  by  VV&A. 

•  The  model’s  documentation  must  be 
tailored  to  the  application’s  require¬ 
ments  and  must  describe  the  specifics 
of  assumptions  made. 


•  The  documentation  must  incorporate 
inputs  and  suggestions  from  both  the 
developer  and  the  users. 

•  The  support  documentation  also  must 
focus  on  user  needs.  These  include 
describing  the  installation  and  opera¬ 
tion  of  the  product,  descriptions  of 
potential  machine  hardware  impacts 
on  hosting  the  product,  and  the 
supply  of  a  typical  example  for 
turnkey  operation  checks. 

•  In  the  practice  of  planning  a 
product’s  development,  funding  for 
the  documentation  must  begin  at  the 
start  of  the  program. 

Many  have  experienced  model  docu¬ 
mentation  that  provided  volumes  of  detail 
but  lacked  a  useful  top  level  overview  or 
executive  summary.  The  overview  needed 
should answer the following questions:
What  does  the  model  do?  How  will  it 
function?  What  does  it  need  as  inputs? 
What  will  it  provide  as  outputs?  What  are 
its  internal  functions  doing?  With  what 
level  of  fidelity  does  it  perform  these  func¬ 
tions?  How  have  these  functions  been 
shown  to  be  representative  of  the  actual 
application?  What  is  needed  to  use  this 
model?  What  assumptions  were  made? 
What  factors  that  cause  deviation  were 
neglected? 
2.2  Configuration Management


CONFIGURATION  MANAGE¬ 
MENT  (CM)  is  a  discipline  apply¬ 
ing  technical  and  administrative 
oversight  and  control  to  identify 
and  document  the  functional 
requirements  and  capabilities  of  a 
model  and  its  supporting  data¬ 
bases,  control  changes  to  those 
capabilities,  and  document  and 
report  the  changes  as  required  by 
VV&A.  Configuration  manage¬ 
ment  includes  ensuring  the  de¬ 
tailed  design  and  the  computer 
source  code  of  the  model  are 
properly  documented  and 
tracked. 


The  most  important  responsibility  of 
the  CM  group  will  be  the  maintenance  of 
version  control  of  source  code  for  the  pro¬ 
ject. 

Figure II-3 presents a flow chart
that highlights some of the principal
processes over which configuration management
has cognizance, including:

•  The  storage  and  maintenance  of  the 
model  requirements, 

•  The  storage  and  oversight  of  the 
model  revisions  and  upgrades, 

•  The  library/archiving  of  descriptive 
materials,  records,  reports,  and  pro¬ 
gram  documents. 


FIGURE II-3. Levels of Configuration Management
The process of configuration management
can be performed at least at three
interest  levels:  the  designer/procurement 
level,  the  developer  level,  and  the  user  com¬ 
munity  level.  Although  all  levels  are  shown 
to  share  common  interest  in  the  five  basic 
functions,  the  perspective  and  actual  content 
of  materials  emphasized  for  each  level  may 
differ  slightly.  Although  the  users  of  the 
CM  facility  are  shown  as  separate  entities, 
all  require  access  to  subsets  of  the  same  set 
of  documentation  maintained  by  the  configu¬ 
ration  manager. 

Implementing a CM group's policy
requires that project support and funding be
established at the beginning of the program.
Rigorous procedures must be adopted and
strictly adhered to. Administration of these
policies should be exclusive of other working
program functions in order to exercise
independence. Items to be managed and
archived by the CM group should include:
model requirements and specifications;
model source code, description and documentation;
test plans and reports; operating
procedures; input and output database information;
and performance information.

Tracking and archiving the historical
development of the product is another important
function served by the CM group.

Table II-7 presents a summary of the
elements  and  a  functional  description  of  CM 
in  support  of  VV&A. 


Table II-7. Essential Elements of Configuration Management for VV&A

| OBJECT | DESCRIPTION |
|---|---|
| Requirements | Requirements, specifications, model description, and goals and objectives of the model design. |
| Descriptions | All documentation that describes the functions, operations, and the testing to verify the model design meets requirements. |
| Model Revisions | Maintain independent archiving of code and upgraded versions to track program progress. |
| Test and Verification | Maintain archive of all testing procedures and results from all phases of the model development, modification, and application life cycle. |
| Library Correspondence | Maintain an archive that traces the history of the model and model reports from design and development phases, through user applications, reviews, and community experience. |
| VV&A | Archive all plans, reports, correspondence, reviews, and findings associated with VV&A. |
2.2.1  Configuration Management Application Techniques

Items  that  are  critical  to  maintaining 
configuration  management  include: 

•  Maintaining  archives  consisting  of 
I/O  databases  and  test  data  collected 
during  developmental  and  verifica¬ 
tion  phases. 

•  Tracking  and  archiving  the  results  of 
the  overall  VV&A,  recording  each 
project  phase  and  VV&A  event. 

•  Archiving all project requirements
and measures of performance.

A  clear  definition  of  the  configura¬ 
tion  management  process  for  the  project  is 
required  as  part  of  the  archival  collection.  It 
should  express  the  ranges  of  responsibilities 
assigned  the  CM  group  and  the  functions 
that  the  group  must  perform  to  support  the 
project. The CM group will normally create
its  own  archival  procedures,  computer  archi¬ 
tecture,  and  file  structures.  It  will  create  a 
document  defining  the  methods  it  adopts  so 
all  project  activities  can  review  and  under¬ 
stand  the  CM  environment  and  its  infor¬ 
mation  requirements. 

Exceptions  to  the  data  format  control 
by  the  CM  are  the  database  storage  and  test 
data  formats.  These  formats  are  created  by 
the developer, user, and the VV&A communities
to fit the needs of their responsibilities.
The CM will be responsible for accepting
this data and creating an archival
environment  to  maintain  and  protect  the 
contents  of  these  files  without  alteration,  for 
future  retrieval  and  assessment.  The  I/O 
and  test  data  archives  should  be  available  to 
all  project  members,  subject  to  security 
sensitive  need-to-know  clearance. 
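
The requirement to hold files "without alteration" can be checked mechanically. The following is a minimal sketch, assuming a content fingerprint (SHA-256 from Python's standard `hashlib`) is stored with each file when the CM accepts it. The class `DataArchive` and its methods are hypothetical, and an in-memory dictionary stands in for whatever storage a real CM facility would use.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content fingerprint used to detect any later alteration."""
    return hashlib.sha256(data).hexdigest()


class DataArchive:
    """Accepts data in whatever format the originator chose and keeps it
    byte-for-byte, together with the fingerprint taken on acceptance."""

    def __init__(self):
        self._entries = {}  # name -> (stored bytes, fingerprint)

    def accept(self, name, data):
        if name in self._entries:
            # Archived items are protected; a revision needs a new name.
            raise ValueError(f"{name!r} is already archived")
        digest = fingerprint(data)
        self._entries[name] = (bytes(data), digest)
        return digest

    def retrieve(self, name):
        data, digest = self._entries[name]
        if fingerprint(data) != digest:
            raise IOError(f"{name!r} no longer matches its fingerprint")
        return data


if __name__ == "__main__":
    archive = DataArchive()
    receipt = archive.accept("scenario_inputs", b"illustrative input data")
    print(receipt)
    print(archive.retrieve("scenario_inputs"))
```

The archive never interprets the contents, which matches the exception described above: the originating community controls the format, while the CM controls only custody.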


The  CM  will  maintain  the  results  of 
VV&A by tracking and archiving each
VV&A  milestone  or  event.  This  informa¬ 
tion  contains  V&V  plans  and  reports,  ac¬ 
creditation  decisions,  and  the  results  of  all 
independent  reviews. 


2.2.2  Configuration  Management 
Strengths 

The  CM  improves  VV&A  in  the 
following  ways: 


•  A  complete  historical  record  of  the 
model's  development  and  application 
history  facilitates  VV&A. 


•  The CM's automated facilities are
responsive and flexible in responding
to inquiries.


•  The  CM  provides  documentation 
security. 


•  The  CM  can  maintain  support  for 
multi-level  secure  versions. 


The  advantages  of  using  centralized 
computer  storage  by  the  CM  to  track  the 
model  programs  are  clear.  The  process  of 
keeping  a  historical  log  of  the  various  ver¬ 
sions  of  the  project's  model,  independently 
of the developer's influence, creates the opportunity
to protect against file losses and corruption.
The automated storage environment of
documentation  makes  timely  access  to  infor¬ 
mation  within  reach  of  the  other  project 
members. 

Another  strength  associated  with  an 
independent  CM  group  is  the  ability  to 
provide  documentation  security,  avoiding 
tampering with or loss of information needed
in VV&A. A systematic control of the
filing of information, isolated from the
project’s  routine  activities,  protects  against 
data  loss  and  contamination. 

The  tracking  of  test  results  with  the 
model  version  used  to  create  the  data  is  a 
significant  strength  of  the  CM  facilities’ 
support  of  the  review  process.  The  storage 
of  older  versions  in  a  secure  environment  is 
the  safety  net  for  catastrophes  that  could  and 
do  befall  the  projects.  Also,  having  ar¬ 
chived  multiple  versions  of  software  can 
allow  users  and  reviewers  to  reconstruct  the 
rationale  for  past  decisions. 
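
Tying each test result to the model version that produced it can be sketched as follows. This is an illustration under stated assumptions: a version is identified by a hash of its source text, and the class `VersionLog`, its method names, and the sample source strings are all hypothetical.

```python
import hashlib


def version_id(source_text: str) -> str:
    """Short identifier derived from the source text of a model version."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()[:12]


class VersionLog:
    """Keeps every archived version and the test results produced by it."""

    def __init__(self):
        self._versions = {}  # version id -> source text
        self._results = []   # (version id, test name, outcome)

    def archive_version(self, source_text):
        vid = version_id(source_text)
        self._versions.setdefault(vid, source_text)
        return vid

    def record_result(self, vid, test_name, outcome):
        if vid not in self._versions:
            raise KeyError(f"version {vid} has not been archived")
        self._results.append((vid, test_name, outcome))

    def results_for(self, vid):
        return [(test, outcome)
                for version, test, outcome in self._results
                if version == vid]

    def source_for(self, vid):
        """Recover an older version, e.g. after loss of the working copy."""
        return self._versions[vid]


if __name__ == "__main__":
    log = VersionLog()
    v1 = log.archive_version("gain = 1.0\n")
    v2 = log.archive_version("gain = 1.5\n")
    log.record_result(v1, "range check", "pass")
    print(v1, log.results_for(v1))
    print(v2, log.results_for(v2))
```

Refusing results for an unarchived version is the design choice that enforces the link: no test data enters the record without a recoverable model version behind it.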

2.2.3  Configuration  Management  Limita¬ 
tions 

Some  of  CM’s  limitations  include: 

•  The  CM  group  has  only  the  Govern¬ 
ment  software  code  requirements  and 
standards  to  follow,  instead  of  broad¬ 
er  system  standards. 

•  The  CM  needs  a  repository  for  the 
model  and  the  documentation,  sepa¬ 
rate  from  other  project  activities. 

•  The  CM  group  has  not  always  been 
recognized  as  an  essential  activity  in 
the  support  of  VV&A. 

The  CM  group  must  follow  existing 
software  specifications  that  are  defined  by 
the  government  regulations.  However,  these 
software  specifications  alone  are  insufficient 
to  accommodate  the  information  needs 
demanded  by  VV&A. 

The cost of configuration management
is increased by the need for separate
storage and operating facilities. The argument
must be made that project isolation and
security justify the investment. If VV&A is
to be established as a qualification of a
model's application, then a CM facility must
be accepted.

2.2.4  Configuration  Management  Les¬ 
sons  Learned 

The  following  are  lessons  learned 
about  configuration  management  and 
VV&A: 

•  The  CM  process  must  be  allowed  to 
evolve  over  the  program’s  develop¬ 
ment  and  must  review  materials  for 
backwards  compatibility  (i.e.,  must 
"benchmark"  materials). 

•  The CM must maintain materials in
such a form that they are traceable
to program design requirements.

•  CM  facilities  should  be  maintained  to 
support  the  three  interest  levels  of 
program  activities  and  should  be 
tailored  to  support  functions  applica¬ 
ble  to  each  of  these  three  interest 
groups (see Figure II-3).

•  The CM should be endowed with
facilities to support its assigned
responsibilities and have the authority
to carry out these responsibilities.

•  The  CM  should  maintain  close  con¬ 
tact  with  the  user  community  by 
participating  with  user  groups  and 
developing  a  repository  of  user 
lessons  learned. 

•  The  CM  should  maintain  a  knowl¬ 
edgeable  staff  that  can  support  ade¬ 
quately  the  potentially  immense 
library  of  technical  materials  sup¬ 
porting  a  model  or  simulation. 
The CM must be sensitive to the
application process of the model and recognize
the user interaction. Once the product
is released to the community, the scrutiny it
receives will require much more configuration
control. The CM must provide expanded
services to the users, understand their
viewpoint and evolving applications, and
support the cataloging of their experiences.
The archiving of the user's data and documentation
(reporting their experience and
tracking any changes made in revising the
product) will be a significant function of the
CM. It will require the CM to participate in
user workshops and working groups, and
gain cooperative support from the user. The
CM should maintain files of the user versions,
and expect to create documentation to
support these revisions. When possible, the
developer should be kept abreast of developments
by the CM, and vice versa.

In our experience, unless the program
has funding to maintain developer
involvement for upgrading, the developer
usually steps out of the program. This
leaves the program office to manage the
configuration. Unless the project organization
recognizes the need for formal configuration
management, it may not happen.
Because many program offices have continuous
personnel changes, the configuration
continuity is soon lost. With the requirement
for VV&A, the configuration management
process will be required throughout the
life of the product and must be planned well
and funded adequately.


2.3  Independent Review


INDEPENDENT REVIEW is performed
by competent objective
reviewers who are independent
of the model developer. It includes
either (a) a detailed verification
and/or validation of the
model; or (b) an examination of
the verification and/or validation
performed by the model developer.


Figure II-4 summarizes the information
an independent review team needs in
order  to  make  a  reasonable  assessment  of 
the  model.  Together,  these  items  compose 
the  processes  and  functions  that  have  been 
discussed  in  the  previous  sections  on  docu¬ 
mentation  and  configuration  management. 
They  are  seen  as  essential  information  inputs 
to  the  independent  reviewer,  and  if  an  at¬ 
tempt  to  accredit  a  model  is  made  without 
them,  the  process  would  be  much  more 
difficult. 


Additional  discussion  on  the  review 
processes that lead to accreditation is contained
in Chapter 3, "Verification," Chapter
4, "Validation," and Chapter 5, "Accreditation."


FIGURE II-4. Information Resources Needed by the Independent Review Team
CHAPTER III - VERIFICATION


by  Jim  Metzger 


3.0  INTRODUCTION 
3.0.1  Overview 


VERIFICATION  is  the  process  of  determining  that 
the  implementation  of  the  model  or  simulation 
accurately represents the developer's description
and  specifications. 


Figure I-4, in Chapter I, illustrates
how  verification  fits  into  an  overall  V&V 
framework.  This  figure  is  reproduced 
below  for  the  reader’s  convenience.  The 
primary  verification  methods  —  logical 
verification,  code  verification,  data  verifica¬ 
tion,  and  specific  logic  or  assumption  com¬ 
parison  —  are  discussed  in  this  chapter. 
FIGURE I-4. The Relationships of V&V to the Model Form


3.0.2  Terminology 

To  provide  a  common  basis  for 
understanding,  additional  terminology  is 
introduced below. Referring again to Figure
I-4, the development of a model or simulation
involves preparing functional requirements,
developing a model concept, and
then  proceeding  through  preliminary  design, 
detailed  design,  and  (possibly)  pseudo-code 
to  the  objective  computer  model. 

•  Functional  requirements.  This  is  a 
statement  of  user  requirements  for 
what  is  to  be  represented  in  the 
model.  It  is  an  extract  of  the  rele¬ 
vant  portions  of  the  real  world. 
Ideally,  it  is  documented  in  a  written 
report. 

•  Model  concept.  This  is  a  statement 
of  the  content  and  internal  relation¬ 
ships  of  what  is  to  be  represented  in 
the  model.  It  represents  the 
developer’s  concept,  includes  logic 
and  algorithms,  explicitly  recognizes 
assumptions  and  limitations,  and  is 
documented  in  a  written  report. 

•  Model  design.  This  is  a  highly 
detailed  description  of  the  model, 
including  descriptions  of  algorithms, 
logic  and  data  flow,  input  and  output 
data,  and  assumptions  and  limita¬ 
tions.  It  may  be  preceded  by  the 
intermediate  step  of  a  preliminary 
design  drawn  from  the  model  con¬ 
cept.  The  model  design  is  docu¬ 
mented  in  a  written  report. 

•  Model code (or model implementation).
This is the compilable computer
code. That code may be preceded
by the intermediate step of
pseudo-code, a computer-readable
but not yet compilable implementation
of the model design. The term
"model implementation" applies to
the case of a hardware or human-in-the-loop
simulation.

•  Computer  model  or  simulation.  This 
is  the  compiled  and  executable  ver¬ 
sion  of  the  code,  including  the  spe¬ 
cific  computer  hardware  upon  which 
that  code  is  implemented.  This  is 
the  final  model  to  which  VV&A 
apply.  Note  that  a  distinction  is 
made  between  the  code  and  its  im¬ 
plementation  on  specific  hardware 
because that hardware (and associated
representations of arithmetic) can
affect results.
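
The effect of arithmetic representation on results can be demonstrated with a small experiment. The sketch below is an illustration, not a description of any particular machine: it emulates single-precision accumulation by rounding to a 32-bit IEEE value after every addition (using the standard `struct` module) and compares the outcome with ordinary double-precision accumulation of the same inputs. The function names are hypothetical.

```python
import struct


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single_precision=False):
    """Add `step` to a running total `count` times.

    With single_precision=True the step and every intermediate total are
    rounded to 32 bits, imitating hardware with a shorter word length.
    """
    if single_precision:
        step = to_single(step)
    total = 0.0
    for _ in range(count):
        total = total + step
        if single_precision:
            total = to_single(total)
    return total


if __name__ == "__main__":
    double_total = accumulate(0.1, 100_000)
    single_total = accumulate(0.1, 100_000, single_precision=True)
    print("double precision:", double_total)
    print("single precision:", single_total)
    print("difference      :", double_total - single_total)
```

The same code and the same inputs give visibly different totals under the two representations, which is why the definition above attaches VV&A to the executable on specific hardware rather than to the source code alone.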


3.0.3  Sample  Application 

To illustrate verification methods and
associated techniques and tools, a sample
application will be referenced repeatedly in
this chapter. This involves V&V performed
by a contractor on the preprocessor for a
major Department of Defense (DoD) force
level model. That model accepts approximately
25 million bytes of input data for a
particular scenario. The preprocessor, a
fully computerized process (if not a model
in the purest sense), had been built and
expanded over time by many programmers
and had not been subject to configuration
control procedures. The resulting preprocessor
was not reliable, efficient, nor well
documented; and was no longer a timely
method of preparing input data for the force
level model. The objectives of the V&V
effort were, then, to apply V&V techniques
to the preprocessor itself, to develop automated
verification procedures for input data
(for the preprocessor and hence for the force
level model), and thereby to reduce the time
to prepare input data for new scenarios.
The contracted V&V effort resulted in the
decision by the Government to completely
re-code the preprocessor in order to remove
inefficiencies and add automated data verification
features.

3.1  LOGICAL  VERIFICATION 

3.1.1  Method  Description 

Logical  Verification  is  the  process  of 
ensuring,  at  each  stage  of  development,  that 
all  assumptions  and  algorithms  are  consistent 
with  the  model  concept.  It  should  be  per¬ 
formed  by  both  the  developer  and  an  inde¬ 
pendent  V&V  agent. 

3.1.2  Approach 

Logical  verification  involves  review 
of  documents  prepared  during  development 
to  ensure  consistency  of  assumptions  and 
algorithms  with  what  is  intended  per  the 
model  concept.  Algorithms  are  examined. 
Variables  treated  explicitly  are  identified,  as 
are  potentially  important  but  excluded  vari¬ 
ables.  Values  of  constants  are  checked. 
Assumptions  that  must  hold  for  the  algo¬ 
rithms  to  apply  are  identified.  Ideally, 
logical  verification  is  performed  in  parallel 
with  model  development,  thereby  allowing 
for  early  identification  and  correction  of 
inconsistencies.  If  performed  after  develop¬ 
ment  or  even  after  fielding,  it  can  still  be 
effective,  although  possibly  more  costly  in 
time  and  analytical  resources.  The  tools  and 
techniques  for  logical  verification  are  de¬ 
scribed  below. 

Documentation  review.  At  whatever 
stage  of  model  development  logical  verifica¬ 
tion  is  first  applied,  the  collection  and  re¬ 
view  of  existing  model  documentation  is  the 
first  step.  A  documentation  review  must  be 
hierarchically  ordered;  that  is,  it  begins  with 
initial  high-level  statements  of  functional 
requirements for the model and proceeds
down through model design specifications.
(Code  verification  takes  over  beyond  the 
model  design  documents  and  is  addressed  in 
paragraph  3.2.)  At  each  level,  the  docu¬ 
ment  under  review  is  compared  for  logical 
consistency  with  its  predecessors.  "Logical 
consistency"  does  not  mean  absolute  correct¬ 
ness.  At  each  level,  the  developer  will  have 
a  range  of  options  with  which  to  implement 
a  required  feature.  None  of  the  options  may 
be  "absolutely  correct,"  particularly  when 
the  phenomenon  or  event  is  not  fully  under¬ 
stood.  Some  approaches  may,  however,  be 
demonstrably  incorrect  from  a  technical 
perspective,  such  as  using  an  algorithm  that 
fails  to  implement  the  designer's  stated 
intent.  Most  options,  however,  will  be  at 
least  consistent  with  the  intent  (explicit  or 
implicit)  specified  in  predecessor  documents. 
Note  that  the  same  considerations  apply 
within  any  one  document  as  well;  here  the 
issue  is  internal  consistency. 

Requirements  accounting.  This  is  a 
process  that  traces  requirements  from  their 
earliest written form (in, for example, a
functional  requirements  document)  through 
all  design  documents  to  implementation  in 
code.  The  intent  is  to  ensure  that  all  re¬ 
quirements  have  been  accounted  for.  Each 
original  requirement  must  be  linked  in  "tree 
fashion"  to  one  or  more  functions  or  fea¬ 
tures  at  the  next  development  step.  Con¬ 
versely,  each  function  or  feature  must  be 
traceable  back  to  a  requirement.  A  failure 
in  either  direction  is  a  requirements  account¬ 
ing  discrepancy.  This  technique  is  often 
applied  in  an  independent  V&V  process 
accompanying  development  of  a  new  model 
under  contract  or  major  modification  to  an 
existing  model  via  contract. 
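The two-way trace described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the requirement identifiers, the feature names, and the function `requirements_discrepancies` are hypothetical and do not come from any particular model or tool.

```python
from typing import Dict, List, Set


def requirements_discrepancies(
    requirements: Set[str],
    trace: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return requirements with no feature and features with no requirement.

    `trace` maps each design feature to the requirement IDs it claims to satisfy.
    """
    # Forward direction: every requirement must be claimed by some feature.
    covered = {req for reqs in trace.values() for req in reqs}
    untraced_requirements = sorted(requirements - covered)
    # Backward direction: every feature must point at a known requirement.
    orphan_features = sorted(
        feature for feature, reqs in trace.items()
        if not any(req in requirements for req in reqs)
    )
    return {
        "untraced_requirements": untraced_requirements,
        "orphan_features": orphan_features,
    }


# Hypothetical example data, not taken from any real model.
requirements = {"R1", "R2", "R3"}
trace = {
    "F-attrition": ["R1"],
    "F-movement": ["R1", "R2"],
    "F-weather": [],  # feature with no originating requirement
}
report = requirements_discrepancies(requirements, trace)
print(report)
```

In practice the same check would be repeated at each development step (requirements to design, design to code), since a discrepancy can enter at any level.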

Design  walk-through.  This  involves 
the  design  team  discussing  each  aspect  of 
the design with a group of functional experts
in an interactive session. While it is a
method of uncovering flaws in the design, it
may also uncover flaws in the statement of
requirements or in the specifications. Ideally,
some form of requirements accounting
will have been applied prior to the
walk-through in order to provide both a
structure to the walk-through and complete
coverage  of  the  issues.  In  any  event,  the 
design  team  may  use  any  of  a  variety  of 
systems  engineering  diagramming  techniques 
(e.g.,  data  flow  diagrams)  and  other  visual 
aids  (e.g.,  symbolic  and  iconic  models)  in 
interactive  briefings  to  explain  each  feature 
of  the  design  (i.e.,  what  requirement  it 
responds  to,  how,  and  why).  When  other 
design  options  might  be  both  obvious  and 
attractive,  the  reasons  for  adopting  the 
selected option should be stated. Often, the
"why" of a choice is the most important
information.  First,  it  is  the  one  most  likely 
to  uncover  a  mismatch  between  expectations 
and  design.  Second,  it  is  the  one  most 
likely  to  uncover  a  mismatch  between  expec¬ 
tations  and  stated  requirements.  Conse¬ 
quently,  to  be  effective,  a  design 
walk-through  should  be  challenging  but  not 
adversarial.  Both  the  design  team  and  the 
review  team  must  be  ready  to  fully  explain 
what  they  have  documented  and  stated. 

Flow  diagrams.  Flow  diagramming 
(also  called  data  flow  diagramming,  struc¬ 
tured  analysis,  or  occasionally  process 
diagramming)  is  a  technique  that  approaches 
any  system  or  process  from  the  perspective 
of  the  data  being  used  or  manipulated.  It  is 
particularly  powerful  and  robust  as  a  vehicle 
of  communication  between  systems  design¬ 
ers  and  functional  users.  In  essence,  flow 
diagramming  uses  a  very  simple  set  of 
symbols to record how data (in whatever
form) moves through and is transformed
by a process. Each level of flow diagram is
itself analyzed using the same technique
until either no further information is to be
gained by going to a still lower level or the
current  level  is  sufficient  for  the  purpose  at 
hand  (e.g.,  briefing  an  existing  design  as 
opposed  to  developing  a  new  design).  At  a 
lower  level  —  reflecting  the  planned  or 
actual  implementation  of  a  computerized 
system  —  flow  diagramming  can  also  in¬ 
clude  flow  charting,  which  is  a  technique 
that  uses  standard  symbols  to  represent  data 
flows,  system  logic,  and  physical  data  pro¬ 
cessing  entities.  In  comparison  to  data  flow 
diagramming,  flow  charting  places  far  great¬ 
er  emphasis  on  the  hardware  and  software 
mechanics  of  a  system  and  thus  is  most 
useful  in  support  of  the  model  design  for  a 
system  or  in  troubleshooting  a  fielded  sys¬ 
tem. 

Algorithm  checks.  This  involves 
rigorous  verification  of  the  mathematics  of 
an  algorithm  to  ensure  freedom  from  any 
errors in the expressions (e.g., incorrect
signs,  incorrect  variables  applied  in  equa¬ 
tions,  derivation  errors)  and  to  ensure  that 
the  algorithms  are  consistent  with  their 
stated  intents.  Algorithm  checks  are  usually 
a  part  of  a  document  review  effort,  but  also 
may be performed without a more general
review. In either case, they must be performed
at both the design level and (if they
pass that test) again at the pseudo-code
level. The dual check is necessary because
the mathematical expressions themselves
change, and are subject to human error,
when transformed from symbolic form in
design documents into pseudo-code form.
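One simple aid to the dual check is to evaluate the design-level expression and its transcribed form over a grid of inputs and look for disagreement. The sketch below assumes a hypothetical detection-probability formula, P = 1 - exp(-rate * time), and a transcription that contains a deliberate sign error; neither is drawn from the models discussed here. A numerical comparison of this kind supplements, but does not replace, a line-by-line check of the mathematics.

```python
import math


def design_detection_probability(rate: float, time: float) -> float:
    """Design-document form (hypothetical): P = 1 - exp(-rate * time)."""
    return 1.0 - math.exp(-rate * time)


def coded_detection_probability(rate: float, time: float) -> float:
    """Pseudo-code transcription containing a deliberate sign error."""
    return 1.0 - math.exp(rate * time)


def max_disagreement(reference, candidate, rates, times) -> float:
    """Largest absolute difference between two forms over a grid of inputs."""
    return max(
        abs(reference(r, t) - candidate(r, t))
        for r in rates
        for t in times
    )


rates = [0.1, 0.5, 1.0]
times = [0.0, 1.0, 5.0]
gap = max_disagreement(
    design_detection_probability, coded_detection_probability, rates, times
)
# A gap far above rounding error signals a transcription problem.
print(f"max disagreement: {gap:.3g}")
```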

Computer-Assisted Systems Engineering
(CASE) tools. These are compiled
application programs that can be used to
analyze the source code of other programs.
These tools provide measures of program
correctness and design efficiency, and can
be used to assist in converting logical or
conceptual process descriptions into a
computer-based  methodology.  Included  here 
are: 

•  Structured  analysis  tools 

•  System  requirements  analysis  tools 

•  Flow  charting  tools 

•  Network  analysis  tools 

•  Performance  models.

While  the  first  three  of  these  have 
obvious  places  in  logical  verification,  the 
remaining  two  also  have  legitimate  roles. 
For  example,  the  system  specifications  for 
an  interactive  model,  particularly  a 
networked war game, may state the peak
projected  data  communication  loads,  maxi¬ 
mum  acceptable  error  rates,  etc.  Perfor¬ 
mance  models  and  network  analysis  tools 
can  then  be  used  to  verify  that  the  projected 
performance  of  a  proposed  design  can  meet 
stated  requirements. 

Sensitivity  analysis.  As  a  validation 
technique,  sensitivity  analysis  involves 
executing  the  model  with  systematically 
varied  input  parameters  to  ensure  that  the 
model  behaves  as  would  be  dictated  by  the 
real  world.  As  a  verification  technique  (and 
specifically  as  a  logical  verification  tech¬ 
nique),  sensitivity  analysis  again  involves 
executing  the  model  with  varied  input  pa¬ 
rameters;  however,  here  the  purpose  is  to 
ensure that the model behaves as dictated by
the  model  concept  and  satisfies  the  intent  of 
the  functional  requirements.  By  implication, 
sensitivity analysis ensures that the model is
properly "sensitive" to those factors that are
essential  to  the  functional  requirements. 
Sensitivity  analyses  are  necessary  due  to  the 
complexity  of  many  models.  The  problem 
is  that  it  is  frequently  impossible  to  express 
the  input-output  relationships  of  a  model  in 
a  single  equation  or  set  of  simultaneous 
equations.  Instead,  most  models  and  simu¬ 
lations  achieve  their  solutions  in  a  step-wise 
fashion.  At  each  step,  an  intermediate 
outcome  is  determined  from  the  set  of  input 
parameters  to  that  step  and  in  turn  becomes 
an  input  parameter  to  the  next  step.  Fur¬ 
thermore,  an  intermediate  outcome  may 
determine  which  step  is  next.  Such  situa¬ 
tions  give  rise  to  complex  multi-dimensional 
outcome  distributions.  Sensitivity  analysis 
attempts  to  provide  the  reviewer  with  an 
understanding  of  such  an  outcome  distribu¬ 
tion  and  its  relationship  to  a  particular  range 
of  input  values,  without  necessitating  a  full 
understanding  of  the  internal  complexities  of 
the  model. 

Determining  directly  whether  a 
model  behaves  as  intended  is  frequently 
impossible,  simply  because  there  is  no 
"intended"  or  expected  outcome  distribution 
to  use  for  comparison.  Instead,  sensitivity 
analysis  should  be  applied  in  a  four-step 
process.  First  of  all,  the  model  is  broken 
into  components  for  which  the  outcome 
distributions are known, potentially breaking
it down to the level of the modules corresponding
to the individual algorithms.
Second, each such component is analyzed to
determine which input parameters should be
varied, in what combinations, and over what
ranges, to test the model-generated outcome
distribution adequately. Third, the actual
tests  are  performed  using  either  direct 
one-on-one  comparisons  for  deterministic 
components  or  statistical  hypothesis  testing 
for  stochastic  components.  If  the  model 
passes those tests, the fourth step is to
design  and  conduct  a  sensitivity  analysis 
experiment  that  exercises  the  complete 
model  over  ranges  of  input  parameter  values 
and  examines  the  resulting  outcomes  for 
logical  consistency  with  their  respective 
input  values  and  with  each  other.  Often, 
only  the  fourth  of  these  steps  is  applied,  and 
occasionally  little  planning  goes  into  its 
design.  Unfortunately,  that  fourth  step  by 
itself  is  an  extremely  weak  form  of  verifica¬ 
tion.  The  basic  problem  is  still  that  the 
form, shape, and parameters of the outcome
distribution  of  the  complete  logical  model 
remain  unknown.  When  only  the  fourth 
step  is  performed,  however,  it  becomes  the 
verification  analogue  of  face  validation,  i.e., 
the  best  that  the  verification  agent  can  say  is 
"For  the  specific  set(s)  of  input  parameter 
values  that  were  used,  nothing  was  seen  in 
the  output  values  that  would  discredit  model 
results." If that statement can be combined
with additional statements regarding how the
experiment was designed to ensure that
critical  relationships  were  identified  and 
adequately  tested,  confidence  should  in¬ 
crease  but  would  still  fall  short  of  what 
could  be  achieved  by  applying  all  four  steps. 
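The third step, component-level testing, can be sketched as follows. The example assumes two hypothetical components whose outcome distributions are known: a deterministic travel-time calculation, checked by direct one-on-one comparison, and a stochastic single-shot kill draw, checked with a two-sided z test on the observed proportion. The component names, the input values, and the 1.96 threshold (roughly a 5 percent significance level) are illustrative choices, not prescriptions.

```python
import math
import random


def deterministic_matches(component, cases, tol=1e-9):
    """One-on-one comparison: every (inputs, expected) pair must agree."""
    return all(
        abs(component(*inputs) - expected) <= tol
        for inputs, expected in cases
    )


def proportion_z(successes: int, trials: int, p0: float) -> float:
    """z statistic for H0: the true success probability equals p0."""
    p_hat = successes / trials
    return (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / trials)


# Hypothetical deterministic component: travel time = distance / speed.
def travel_time(distance_km: float, speed_kmh: float) -> float:
    return distance_km / speed_kmh


# Hypothetical stochastic component: a single-shot kill draw.
def kill_draw(rng: random.Random, p_kill: float) -> bool:
    return rng.random() < p_kill


cases = [((100.0, 50.0), 2.0), ((30.0, 60.0), 0.5)]
print("deterministic ok:", deterministic_matches(travel_time, cases))

rng = random.Random(1992)  # fixed seed so the run is repeatable
trials, p_kill = 10_000, 0.3
kills = sum(kill_draw(rng, p_kill) for _ in range(trials))
z = proportion_z(kills, trials, p_kill)
# |z| > 1.96 would reject H0 at roughly the 5 percent level (two-sided).
print(f"kills={kills}, z={z:.2f}, reject={abs(z) > 1.96}")
```

Only after components pass tests of this kind does the fourth step, the whole-model experiment, carry much weight.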

Note  also  that  sensitivity  analysis  for 
the  purpose  of  logical  verification  should 
only  be  performed  after  code  verification, 
because  even  a  relatively  minor  code  imple¬ 
mentation  problem  or  error  could  invalidate 
the  findings  of  any  logical  verification  test 
using that code.

Reverse  engineering.  This  is  a 
model  assessment  methodology  that  has 
application  to  both  the  logical  verification 
and  the  logical  validation  methods.  It  is 
based  on  the  fact  that  the  capabilities,  accu¬ 
racy,  and  validity  of  a  fully  computerized 
model can be no better than those of the
underlying logical model. Implementation
considerations  and  techniques  can  at  best 
preserve  the  attributes  of  the  logical  model, 
but  may  degrade  them.  Thus  an  assessment 
based  on  the  logical  model  can  provide  an 
estimate  of  the  high  end  of  a  model’s  attrib¬ 
utes. 

A  model's  analytical  capabilities  and 
technical  validity  attributes  are  determined 
by  its  logic  and  control  structures  and  their 
underlying  assumptions,  its  computational 
algorithms  and  underlying  mathematical 
assumptions,  and  its  data  manipulation  and 
transformation  algorithms.  The 
reverse-engineering  approach  breaks  the 
logical  model  into  its  component  algorithms 
and  logic  constructs  and  then  derives  those 
same  algorithms  and  develops  those  same 
constructs  from  a  zero  base.  As  the  deriva¬ 
tions  and  developments  proceed,  each  as¬ 
sumption  necessary  to  each  step  of  the 
process  is  identified  and  recorded. 

For  reverse  engineering  applied  in 
logical  verification,  the  assumptions  are 
examined  for  logical  consistency  among 
themselves  and  with  the  requirements  and 
precepts  of  the  model  concept.  Inconsisten¬ 
cies  are  noted  and  analyzed  for  their  impli¬ 
cations  vis-a-vis  the  intended  application  of 
the  model.  Ultimately,  the  user  must  be 
asked  to  decide  whether  those  implications 
are  severe  enough  to  require  adopting  alter¬ 
native  assumptions  and  thus  revising  the 
model. 

A  shortened  form  of  reverse  engi¬ 
neering  attempts  to  identify  and  analyze  only 
a  few  "most  critical"  algorithms  and  logic 
constructs  in  the  model.  It  further  attempts 
to  start  at  some  level  above  the  zero  base. 
Doing  so,  however,  presumes  the  presence 
(in  the  subject  algorithms  and  constructs)  of 
building block terms or algorithms that have
received  rigorous  and  documented 
zero-based  assessments  in  the  past.  For  the 
shortened  form  to  be  effective,  model  docu¬ 
mentation  must  be  comprehensive,  particu¬ 
larly  regarding  algorithms  and  logic  con¬ 
structs. 

3.1.3  Sample  Application 

Regarding  V&V  of  the  preprocessor 
introduced  in  paragraph  3.0.3,  the  first  step 
was  to  review  requirements  to  determine  the 
logic  needed.  This  review  was  accom¬ 
plished  through  several  methods,  including: 
review  of  design  documents,  analysts’  manu¬ 
als,  and  programmer  comments  within  the 
source  code;  interviews  with  users  and 
subject  area  experts;  and  (in  limited  cases) 
review  of  DoD  policies  regarding  doctrine, 
procedures,  and  reporting.  Following  a 
thorough  and  extensive  requirements  review, 
a  new  detailed  system  design  plan  was 
produced that thoroughly captured every
portion of the logic, data manipulations, and
algorithms to be included in the preprocessor.
This document was reviewed by users,
analysts,  data  source  specialists,  and  subject 
area  experts  as  appropriate.  The  design 
document  was  subsequently  used  as  the  basis 
for  re-coding  the  preprocessor. 

3.2  CODE  VERIFICATION 

3.2.1  Method  Description 

Code  verification  is  a  rigorous  audit 
of  all  code  (pseudo-code  and/or  compilable 
code) to ensure proper implementation of
the  model  design. 

3.2.2  Approach 

Listed  here  are  code  verification 
techniques.  Explanations  are  provided 
where  meanings  are  not  evident  and  have 
not  been  provided  earlier  in  this  chapter. 


•  Documentation  review. 

•  Code  walk-through. 

•  CASE  tools. 

•  Automated  test  tools.  A  number  of 
computerized  tools  exist  for  generat¬ 
ing  test  cases,  test  data,  and 
coverage  measures  for  employed 
tests. 

•  Peer  review. 

•  Sensitivity  analysis. 

•  Requirements  accounting. 

3.2.3  Sample  Application 

For  the  preprocessor  case  study 
introduced  above,  code  verification  was 
applied  to  the  original  preprocessor  and  to 
the  re-coded  preprocessor.  For  the  original 
preprocessor,  code  verification  was  per¬ 
formed  to  identify  correctly  and  incorrectly 
implemented steps. For the re-coded
preprocessor, code verification was performed
to  ensure  that  all  design  functions  were 
correctly  implemented.  For  both  the  origi¬ 
nal  and  the  re-coded  preprocessor,  code 
verification  involved  thorough  exercising  of 
all  functions  of  the  preprocessor  to  test 
input-to-output  relationships  and  processing. 

The  primary  tools  used  were  comput¬ 
er  compiler  and  system  environment  tools. 
The  primary  technique  applied  was  code 
walk-through.  This  involved  individual 
team  members  stepping  through  other  pro¬ 
grammers’  code  to  ensure  "sanity,"  adher¬ 
ence  to  established  coding  conventions, 
appropriate  commenting,  proper  file  input 
and  output  control,  and  correct  variable 
naming  and  usage.  All  computer  source 
code and object code were subject to rigorous
test and review by the contract development
team, as well as by the end user organization.
A test plan and associated documentation
(such as program specifications
and  maintenance  manual)  were  also  prepared 
and  delivered  with  the  new  computer  code. 

3.3  DATA  VERIFICATION 

3.3.1  Method  Description 

Data  verification  is  the  process  of 
ensuring  that  source  data  that  are  to  be  used 
in  the  model  are  converted  correctly  to 
model  input  data  and  are  consistent  with  the 
concept  and  logical  design  of  the  model. 

3.3.2  Approach 

Listed  here  are  techniques  for  data 
verification. 

•  Documentation  review. 

•  Checks  of  range  and  dimension  of 
data.

•  Plots  of  data. 

3.3.3  Sample  Application 

Returning  to  the  preprocessor  case 
study,  all  input  data  had  to  be  verified  as 
correct  from  external  sources.  This  step 
was  accomplished  as  a  cooperative  effort  of 
the  contractor  performing  V&V  and  the 
DoD  user  organization.  The  latter  has 
corporate  knowledge  of  the  sources  and 
proper  format  of  the  raw  data.  All  data 
computations  stated  in  the  design  plan  were 
also  verified  to  ensure  consistency  and 
proper  manipulation.  This  was  especially 
true  for  data  on  various  classes  of  supply 
that  are  to  be  distributed  across  theater 
regions  and  time  periods  in  accordance  with 
military  populations. 


Data  formats  are  particularly  impor¬ 
tant  since  the  force  level  model  itself  applies 
strict  conventions  in  its  input  read  state¬ 
ments;  data  outside  proper  ranges  can  cause 
immediate  read  errors  or,  worse,  model 
execution  with  embedded  errors.  To  ensure 
proper  format  for  preprocessor  output  data 
(force level model input data), rigid checking
procedures were included in the
re-coded  preprocessor.  Unacceptable  output 
data  (e.g.,  values  outside  specified  ranges) 
would  generate  appropriate  messages.  The 
re-coded  preprocessor  was  then  subjected  to 
vigorous  sensitivity  testing  to  guarantee  that 
erroneous  output  data  formats  could  not 
occur  without  warning,  and  thereby  to 
ensure  that  data  values  outside  acceptable 
ranges  could  not  be  entered  into  the  force 
level  model. 

Additional  features  (appropriate  to 
code  verification  or  data  verification)  of  the 
re-coded preprocessor included two reports:
an audit trail report and a report generator.
The audit trail report is generated automatically
by the preprocessor during each major
processing step and includes information on
the  number  of  data  records  processed  from 
various  input  files,  identification  of  records 
containing  out-of-range  values,  and  cross 
data  file  validity  checks  of  unit  hierarchy. 
This  report  provides  statistics  on  program 
execution,  as  well  as  automated  verification 
checks. The report generator allows users
and analysts to query intermediate or final
data  bases  for  conditions  or  quantities.  This 
permits  further  data  verification.  Unit 
organization  structures,  dependencies,  and 
equipment  holdings  can  also  be  reported  for 
checking  against  the  scenario  description. 
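The kind of range checking and audit trail described above can be illustrated with a short sketch. The field names, limits, and record layout below are hypothetical; the actual preprocessor's formats are not documented here, so this should be read as an assumption-laden example of the technique rather than a description of that program.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AuditTrail:
    """Counts records processed and notes any out-of-range values."""
    processed: int = 0
    messages: List[str] = field(default_factory=list)


def check_records(
    records: List[Dict[str, float]],
    ranges: Dict[str, Tuple[float, float]],
) -> AuditTrail:
    """Check each field of each record against its inclusive allowed range."""
    audit = AuditTrail()
    for index, record in enumerate(records):
        audit.processed += 1
        for name, (low, high) in ranges.items():
            if name not in record:
                audit.messages.append(f"record {index}: missing field {name}")
            elif not low <= record[name] <= high:
                audit.messages.append(
                    f"record {index}: {name}={record[name]} "
                    f"outside [{low}, {high}]"
                )
    return audit


# Hypothetical field names and limits, for illustration only.
ranges = {"strength_pct": (0.0, 100.0), "fuel_tons": (0.0, 5000.0)}
records = [
    {"strength_pct": 85.0, "fuel_tons": 120.0},
    {"strength_pct": 130.0, "fuel_tons": 40.0},  # out of range
    {"strength_pct": 60.0},                       # missing field
]
audit = check_records(records, ranges)
print(audit.processed, "records processed")
for message in audit.messages:
    print(message)
```

Reporting every discrepancy, rather than stopping at the first, gives the reviewer a complete picture of the input data in a single pass.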

3.4 SPECIFIC LOGIC AND/OR ASSUMPTION COMPARISON

3.4.1  Method  Description 

This  is  a  process  of  ensuring  that  the 
logic  design,  implicit  assumptions,  and 
explicit  assumptions  are  consistent  with  a 
specific  type  of  application.  The  process 
identifies  strengths  and  weaknesses  of  the 
model  for  the  specific  application.  The 
process  is  applied  when  a  model  is  proposed 
for  a  specific  type  of  application  for  which 
accreditation has not previously been granted.
(Refer to Chapter V for further discussion
on the topic of accreditation.)

3.4.2  Approach 

The techniques for logical verification,
code verification, and data verification
listed  in  previous  sections  apply  here  as 
well.  The  most  likely  techniques  to  be  used 
are  the  following: 

•  Documentation  review. 

•  Algorithm  checks.

•  Sensitivity  analysis. 

•  Peer  review. 

3.4.3  Sample  Applications 

One  application  of  logic/assumption 
comparison is the user survey performed by
the  Warrior  Preparation  Center  (WPC). 
After  a  war  game  exercise  supported  by 
WPC’s  family  of  models,  WPC  solicits 
feedback  from  participants  on  the  utility  of 
the exercise. Comments on the fidelity of
the game and on its adequacy for desired
training purposes are possible.

3.5  SUMMARY 
3.5.1  Utility 

Verification ensures that the implementation
of the model accurately represents
the  developer’s  description  and  specifica¬ 
tions.  Short  of  exercising  the  model,  verifi¬ 
cation  provides  the  foundation  to  ensure  that 
the  model  meets  user  needs.  For  this  rea¬ 
son,  verification  can  provide  the  basis  for  an 
initial  accreditation.  Returning  one  last  time 
to  the  preprocessor  case  study,  re-coding 
provided increased credibility of preprocessor
data sources, processing, and output
formats, and thereby increased the credibility
of input data for the force level model
itself.  This,  in  turn,  allowed  for  better 
identification  of  problem  areas  in  the  force 
level  model.  Thus,  V&V  applied  to  the 
preprocessor  resulted  in  improved  capability 
to  perform  V&V  on  the  force  level  model. 

3.5.2  Strengths,  Limitations,  and  Lessons 
Learned 

The  advantage  that  verification  meth¬ 
ods  have  over  most  of  those  of  validation  is 
that  they  do  not  require  data  from  the  "real 
world" or from other models/simulations.
Thus  even  when  no  comparative  data  are 
available,  verification  can  increase  the  credi¬ 
bility  of  a  model.  Verification,  however, 
can  never  be  a  substitute  for  validation.  On 
the  other  hand,  a  comprehensive  validation 
cannot  substitute  for  verification.  For  in¬ 
stance,  a  model  might  produce  predictions 
consistent  with  real  world  data,  but  logical 
verification  might  show  that  the  model  does 
not  respond  adequately  to  the  original  re¬ 
quirements.  As  the  size  of  a  model  grows, 
line-by-line  verification  of  code  becomes 
less  practical,  and  validation  techniques 
become  more  essential. 

The  major  limitation  that  applies  to 
several  of  the  logical  verification  techniques 
discussed  above  is  that  applying  them  may 
be as much art as science. This is especially
true of reverse engineering, design
walk-throughs, and documentation reviews.
Furthermore,  reverse  engineering,  sensitivity 
analysis,  and  algorithm  checks  require 
considerable  expertise  in  mathematics, 
including  areas  such  as  design  of  experi¬ 
ments  and  statistical  hypothesis  testing. 

To  facilitate  logical  verification, 
design  documentation  should  include  de¬ 
tailed  descriptions  of  the  algorithms  and 
flow  charts  showing  how  input  variables  are 
transformed  (through  intermediate  variables) 
into output variables. In addition, the documentation
should describe the model concept,
the major processes in the model, and
how  those  processes  interact. 

Significantly,  documentation  is  often 
lacking,  inadequate,  or  dated.  Models  are 
continually  updated  to  correct  errors,  im¬ 
prove  performance,  or  add  new  capabilities. 
Documentation  lags.  Where  documentation 
is  inadequate,  reviewers  may  be  forced  to 
resort  to  code  verification  techniques  --  a 
daunting  task  for  a  large  model.  Standards 
for  documentation  must  be  included  under 
configuration  management  to  ensure  that 
logical  verification  can  be  performed. 

Where  an  independent  V&V  agent 
performs  verification,  effective  dialogue 
must  be  maintained  between  that  agent  and 
the  model  developers.  Questions  arising 
from  the  documentation  (or  its  absence)  can 
frequently  be  answered  by  the  developers. 
The  exchange  of  information  and  ideas 
assists  the  reviewers  in  correctly  interpreting 
the  documentation  and  understanding  the 
model. Draft review documentation should
be provided to the developers to permit
correction  of  factual  errors  prior  to  final 
publication  and  presentation  of  findings. 

For  a  time-stepped  model,  the  review 
should  examine  the  choice  of  the  time  step, 
which  can  be  important  for  fidelity  and  run 
time.  Choosing  too  large  a  step  could 
render  representation  of  some  essential 
activities  impossible.  For  example,  for  a 
combat  model  that  represents  air-to-air 
engagements,  a  one-minute  time  step  may  be 
too  large,  since  air  targets  may  enter  and 
leave  the  launch-acceptability  region  within 
that  time  step.  On  the  other  hand,  choosing 
too  small  a  time  step  could  elevate  run  time 
to  an  unacceptable  level;  e.g.,  for  a  training 
simulation,  run  times  must  generally  be 
maintained  at  (or  faster  than)  real  time. 
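The air-to-air example can be made concrete with a rough calculation. The sketch below assumes a target crossing a region of fixed depth at constant closing speed and a model that samples state only at step boundaries; the 15 km depth and 500 m/s closing speed are hypothetical numbers chosen for illustration.

```python
def dwell_time_s(region_depth_m: float, closing_speed_mps: float) -> float:
    """Time a target spends crossing a region at constant closing speed."""
    return region_depth_m / closing_speed_mps


def step_can_miss_transit(time_step_s: float, dwell_s: float) -> bool:
    """True when a target could enter and leave between two updates."""
    return time_step_s > dwell_s


# Hypothetical numbers: 15 km deep region, 500 m/s closing speed.
dwell = dwell_time_s(region_depth_m=15_000.0, closing_speed_mps=500.0)
for step in (60.0, 10.0, 1.0):
    print(
        f"step={step:>5.1f} s  dwell={dwell:.1f} s  "
        f"can miss transit: {step_can_miss_transit(step, dwell)}"
    )
```

Under these assumptions the target is inside the region for 30 seconds, so a one-minute step can skip the engagement opportunity entirely, while steps of 10 seconds or less cannot. The run-time cost of the smaller step must then be weighed separately.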

For  an  event-stepped  model,  the 
review  should  examine  logical  flow  to  en¬ 
sure  that  event  interactions  are  properly 
considered.  Sometimes  a  particular  event  is 
initiated  and  scheduled  for  later  completion 
regardless  of  other  events  that  could  inter¬ 
vene  and  affect  its  completion. 
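The flaw described above can be illustrated with a minimal event queue. In this sketch a resupply is scheduled to complete at time 10, but the unit is destroyed at time 5. The event kinds, the unit name, and the `recheck` flag are hypothetical; the point is only that a completion event should re-examine the state of the model when it fires.

```python
import heapq
from typing import List, Set, Tuple


def run(events: List[Tuple[float, str, str]], recheck: bool) -> List[str]:
    """Process (time, kind, unit) events in time order and return a log.

    With recheck=False, a scheduled completion fires even if the unit was
    destroyed in the meantime, which is the flaw a review should look for.
    """
    queue = list(events)
    heapq.heapify(queue)
    destroyed: Set[str] = set()
    log: List[str] = []
    while queue:
        time, kind, unit = heapq.heappop(queue)
        if kind == "destroyed":
            destroyed.add(unit)
            log.append(f"{time:g}: {unit} destroyed")
        elif kind == "resupply_complete":
            if recheck and unit in destroyed:
                log.append(f"{time:g}: resupply of {unit} cancelled")
            else:
                log.append(f"{time:g}: {unit} resupplied")
    return log


# Hypothetical scenario: resupply scheduled for t=10, unit lost at t=5.
events = [(10.0, "resupply_complete", "A"), (5.0, "destroyed", "A")]
print(run(events, recheck=False))
print(run(events, recheck=True))
```

A reviewer tracing logical flow would look for completion events of the first kind, where nothing between scheduling and completion can alter the outcome.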

3.5.3  Life  Cycle  Management 

Verification (indeed, V&V in general)
should be seen as a continual process that
parallels  development  and  enhancement  of  a 
model.  Generally,  a  model  is  developed, 
adjusted,  and  expanded  over  its  life.  At 
appropriate  times,  verification  should  be 
applied to ensure credibility of the model for
its  original  or  newly  intended  applications. 
Clearly,  V&V  must  be  included  in  overall 
configuration  management,  as  discussed 
previously in Chapter II.

CHAPTER IV


PART  A  -  VALIDATING  MODELS  AND  SIMULATIONS 

by  Donald  Giadrosich 


VALIDATION  -  The  process  of  deter¬ 
mining  the  degree  to  which  a  model 
is  an  accurate  representation  of  the 
real  world  from  the  perspective  of 
the  intended  uses  of  the  model. 


4-A.0 INTRODUCTION

This chapter provides broad guidance for
validating  models  and  simulations,  including 
helpful  information  for  developing  a  detailed 
model  validation  plan,  conducting  an  appro¬ 
priate  model  validation,  and  communicating 
the  results  of  such  activities  to  officials 
responsible  for  model  accreditation.  Be¬ 
cause  model  uses  cover  a  wide  range  of 
purposes,  complexities,  and  activities,  the 
degree  to  which  validation  can  be  achieved 
in a practical manner for each model use
will  vary.  Moreover,  since  many  models 
are  improved  as  they  are  used,  validation  of 
a  model  should  be  a  continuous  process, 
conducted  throughout  its  life  cycle. 

Model  validation  can  be  distinguished 
from  model  verification  in  that  verification 
compares  the  model  against  its  design  speci¬ 
fications,  whereas  validation  compares  the 
model against the real world. The term
"real  world"  is  used  herein  to  characterize 
actual  objects  or  situations,  or  our  best 
representation  of  them.  Model  validation 
can  be  distinguished  from  model  accredita¬ 
tion  in  that  validation  is  a  comparison  pro¬ 
cess  whereas  accreditation  is  a  decision  to 
use  a  model  based  on  some  level  of  verifica¬ 
tion  and  validation. 


The  Military  Operations  Research 
Society  (MORS)  definition  of  model  valida¬ 
tion  shown  above  incorporates  several  oper¬ 
ative  words  which  are  extremely  important. 
First,  validation  is  a  process;  this  implies  it 
must be systematic, traceable, and describable,
and the results repeatable. Second, it
establishes  the  degree  to  which  a  model  is 
an  accurate  representation  of  the  real  world. 
If  the  specifications  to  which  a  model  has 
been  developed  accurately  reflect  the  real 
world,  the  processes  of  verification  and 
validation  should  essentially  yield  the  same 
results.  However,  there  may  be  important 
differences  between  the  model  and  the  real 
world  -  some  intentional  and  some  not 
intentional. The validation process formally
identifies  and  establishes  the  degree  of  the 
important  differences.  Finally,  validation  is 
accomplished from the perspective of the
intended  uses  of  the  model. 

Embedded  in  the  validation  process  is 
the  implied  responsibility  to  identify  and 
document  both  the  proper  use  and  the  poten¬ 
tial  misuse  of  a  model.  Ultimately,  for  each 
model application, validation is accomplished
and documented for the specific
classes of objects (e.g., scenario(s), mission(s),
weapon systems, etc.), specific
levels  of  investigation  (e.g.,  end  game, 
platform  performance,  campaign,  etc.), 
specific inputs and conditions (e.g., parameters,
data bases, etc.), and the specific
outputs  of  interest.  In  military  modeling, 
the  outputs  of  interest  derived  from  the 
models are often described in terms of
measures of effectiveness (MOEs) and
measures of performance (MOPs).

Because  of  the  broad,  all-encompassing 
definition  of  models  and  simulations,  the 
specifics  of  each  model  validation  effort 
must  be  tailored  to  the  given  problem  the 
model  is  being  used  to  solve,  technical 
situation,  or  operational  application.  For 
example,  if  a  model  is  being  used  to  address 
the  effects  of  flare  intensities  and  flare  drop 
patterns  on  the  tracking  and  guidance  of  a 
given  missile,  it  would  likely  be  required  by 
the  accreditor  that  a  high  level  of  engineer¬ 
ing  validation  of  the  flares  and  the  missile 
tracking  capabilities  be  accomplished  for  the 
model.  On  the  other  hand,  a  model  might 
be  accredited  to  investigate  the  probable 
damage  that  could  be  inflicted  against  a 
military  air  base  by  multiple  attacking  air¬ 
craft  even  though  it  has  minimal  detailed 
engineering  fidelity  regarding  the  specific 
effects  of  flares.  This  could  occur  if  the 
known  effects  of  the  flares  as  estimated  by 
physical  testing  (or  a  more  detailed  engi¬ 
neering  model)  were  available  and  could  be 
properly  input  to  the  air  base  attack  model. 
Consequently,  model  validation  can  be 
limited  in  scope.  Although  tailoring  for  the 
specifics  of  the  problem  is  required,  the 
basic  comparative  framework  put  forth  in 
this  chapter  is  generally  applicable  for  all 
types  of  models. 

Model  inputs,  outputs,  and  internal 
functions  all  vary  in  the  degree  of  accuracy 
with  which  they  represent  or  describe  the 
known or agreed upon state of nature. The
model  validation  process  systematically 
identifies  and  documents  this  degree  of 
accuracy.  Since  validation  is  a  comparison, 
a  model  can  be  considered  sufficiently  valid, 
i.e.,  "good  enough,"  if  the  results  of  the 
comparison  (including  strengths,  limitations, 
and  assumptions)  are  acceptable  in  the  eyes 


of  the  accreditor.  (For  further  discussion  of 
Accreditation,  see  Chapter  5.) 

4-A.1 THE MODEL CONCEPT

The primary purpose of a model or
simulation  is  to  provide  a  representation  of 
a  system  or  relationships  which  can  then  be 
used  to  investigate  the  basic  functions  of  that 
system  or  relationships.  A  model,  in  a 
broad  sense,  allows  the  abstraction  of  a 
study  of  a  problem  in  such  a  way  that  (1) 
the  fundamental  processes  of  the  problem 
and  their  influences  and  relationships  can 
be better understood, (2) predictions or
extrapolations  from  the  outcomes  of  current 
problem  conditions  to  potential  outcomes  of 
future  problem  conditions  can  be  made,  and 
(3)  relative  comparisons  of  alternative  sys¬ 
tems  or  solutions  in  meeting  stated  goals  and 
objectives  can  be  made. 

Modern modeling has been extended
to  the  examination  of  extremely  complicated 
problems  ranging  from  military  war  games 
to  vastly  complex  systems  like  those  pro¬ 
posed  for  ballistic  missile  defense.  Model¬ 
ing  has  been  defined  as  encompassing 
"...the  development  of  axiomatic  systems, 
the  formulation  of  social  theories,  the  deri¬ 
vation  of  physical  first  principles,  and  the 
drafting  of  laws.  It  is  thus  an  art  natural  to 
mankind,  and  focusing  this  art  on  the  do¬ 
main  of  military  science  conceptually  en¬ 
compasses  the  principles  of  war,  strategy, 
tactics,  the  laws  of  warfare,  and  the  struc¬ 
ture  of  military  forces."' 

Scientists  and  engineers  employ  models 
as  a  means  of  mathematically  or  logically 
expressing  the  relationships  between  vari¬ 
ables.  Figure  IV-A-1  is  a  simplistic  repre¬ 
sentation  of  this  process. 

[Figure: INPUTS (scenarios, data bases, etc.) -> MODEL (assumptions,
parameters, logic functions, code, etc.) -> OUTPUTS (MOEs, MOPs, etc.)]

FIGURE IV-A-1. Simplistic Representation of the Modeling Process


The  simple  model  depicted  in  Figure 
IV-A-1 can be thought of as somewhat
analogous  to  a  scientific  hypothesis  based  on 
a  priori  knowledge  which  is  accepted  as  cor¬ 
rect  and  from  which  inferences  can  be 
drawn.  The  model  may  contain  axiomatic 
systems,  social  theories,  physical  first  prin¬ 
ciples  and  laws,  and  always  requires  certain 
assumptions.  When  certain  input  data  and 
conditions  (e.g.,  scenarios,  data  bases,  etc.) 
are  provided  to  the  model,  it  operates  on  the 
inputs  to  produce  certain  outputs  that  can  be 
described  in  terms  of  desired  MOEs,  MOPs, 
etc.  The  form  of  the  model  may  be  analog, 
digital,  hybrid,  man-in-the-loop,  hardware- 
in-the-loop,  or  some  other  variant;  and  it 
may  be  deterministic,  probabilistic,  or  a 
combination  of  both. 

A  model  or  simulation  often  takes  on 
a  hierarchical  structure  for  application  to 
very  complex  problems.  This  structure  may 
take the form of very detailed models which
addresses  each  of  the  fundamental  processes 
of  a  problem.  The  outputs  from  these  mod¬ 
els  are  used  to  provide  input  to  the  next 
level  of  the  hierarchy  which  may  treat  sever¬ 


al  of  these  fundamental  processes  and  their 
influences  and  relationships  to  each  other. 
The  outputs  from  this  level  of  the  hierarchy 
feed the next level, etc. Each level of the
hierarchy  addresses  a  larger  problem  but 
usually  in  more  general  and  less  specific 
terms.  This  facilitates  treating  extremely 
complex problems in a more structured,
rigorous  and  transparent  manner  than  would 
otherwise  be  possible. 
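
The hierarchical flow described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not a model
taken from the workshop: the function names (engagement_pk,
mission_success), the damage function, and every numeric value are
hypothetical and were chosen solely to show a detailed lower-level output
being passed upward as an aggregate input.

```python
import math

def engagement_pk(miss_distance_m: float, lethal_radius_m: float) -> float:
    """Lower level (hypothetical): single-shot kill probability from an
    assumed Gaussian-shaped damage function of miss distance."""
    return math.exp(-0.5 * (miss_distance_m / lethal_radius_m) ** 2)

def mission_success(pk_single: float, shots: int) -> float:
    """Higher level (hypothetical): probability that at least one of
    `shots` independent engagements kills the target."""
    return 1.0 - (1.0 - pk_single) ** shots

# The detailed model's output becomes the aggregate model's input.
pk = engagement_pk(miss_distance_m=5.0, lethal_radius_m=5.0)
p_mission = mission_success(pk, shots=3)
print(f"single-shot Pk = {pk:.3f}, mission success = {p_mission:.3f}")
```

Note that the higher level no longer sees miss distance at all; it treats
the problem in more general terms, which is why validation of the
lower-level output matters to everything built on top of it.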

Modelers  and  simulators  sometimes 
describe  their  applications  of  this  hierarchi¬ 
cal  approach  to  modeling  in  terms  of  levels 
of  analysis.  Over  the  past  decade,  some 
analytical  communities  have  adopted  four 
levels  of  model  analysis,  which  are  defined 
in  "A  Methodology  for  the  Test  and  Evalua¬ 
tion  of  Command,  Control,  Communica¬ 
tions,  and  Intelligence  (C3I)  Systems” 
(draft),  published  by  the  Deputy  Director, 
Defense  Research  and  Engineering  (Test  and 
Evaluation),  as  part  of  an  Implementation 
Program  Plan  (IPP)  on  "The  Use  of  Model¬ 
ing  and  Simulation  to  Support  the  Test  and 
Evaluation  (T&E)  of  Command,  Control, 
Communications,  and  Intelligence  (C3I) 
Systems," dated 11 April 1990. Table IV-A-1 illustrates these levels
for air combat.

TABLE IV-A-1. Example Levels of Model Analysis

LEVEL  SHORT NAME        KEY PROCESS                 MODELING OUTPUT

I      Engineering       Electromagnetic Signal      Electronic Combat (EC)
                         Flow                        Performance

II     Platform          Weapon System Engagement    Weapon System
                         Performance                 Performance

III    Mission           Multi-Weapon System         Weapon System
                         Operations                  Effectiveness

IV     Theater/Campaign  Force-On-Force Targeting    Force Effectiveness

A level I model, for example, by definition "examines the performance
of an individual engineering subsystem or technique in the presence of
a single threat." The level I model requires as input the engineering
parameters and characteristics of the subsystem in sufficient detail to
ascertain the actual end game effects of the electronic countermeasures
(ECM), chaff, flares, and/or maneuvers employed by a single aircraft on
the performance of the system or missile. These effects can be
determined internal to the model and are not required inputs. By
definition, "level I outputs can be combined and fed into level II
analyses to evaluate the installed, aggregate performance of a number
of specific engineering subsystems or techniques against a specific
threat." This definition implies that the level II model has the
appropriate logic to treat the effects of ECM, chaff, flares, and/or
maneuvers of the individual systems and how they combine and behave
within a specific threat or threats. Levels III and IV are defined to
aggregate at higher orders of complexity and conflict.

The relationship of a model or simulation to a hierarchy of models and
simulations is an important consideration in validation. A model could
very well be validated for stand-alone use but not be validated for use
in a particular M/S hierarchy. Conversely, it may be validated for use
in a particular hierarchy and not be validated for certain stand-alone
usage.

4-A.2  DECOMPOSITION  OF  A  MODEL 
Models and simulations, regardless of
the level of complexity, can be thought of in
terms  of  a  number  of  interrelated  component 
parts  which  function  together  to  take  the 
input  information  and  operate  on  it  accord¬ 
ing  to  specific  model  functions.  Model 
functions  encompass  all  things  internal  to  the 
model (e.g., assumptions, logic, algorithms,
parameters,  coding,  etc.).  During  design, 
an  attempt  is  made  to  structure  the  model 
functions  in  an  optimum  manner  to  produce 
the  desired  outputs.  Decomposition  of  a 
model  can  be  thought  of  as  a  reversal  of  the 
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initial  design  process. 

Model  validation  is  greatly  facilitated  by 
decomposing  models  into  their  component 
parts  and  subparts.  Decomposition  of  a 
model  is  analogous  to  the  application  of 
engineering  systems  theory,  whereby  a  total 
system  is  broken  down  into  a  number  of 
subsystems described by transfer functions.
In  fact,  engineering  systems  theory  can  often 
be  employed  in  this  process.  Each  subsys¬ 
tem  is  then  characterized  by  its  own  model 
functions  and  associated  inputs  and  resulting 
outputs.  Full  system  examination  is  facili¬ 
tated  because  a  large  intractable  problem  is 


broken  up  into  a  number  of  smaller  manage¬ 
able  problems. 

The decomposed model can be examined in terms of model functions as
depicted in Figure IV-A-2.

FIGURE IV-A-2. Decomposition of the Model

Decomposition also allows the modeler to view and better understand the
operation of the internal workings of the system. For example, our
model might be the fly-out model of a specific missile. Our first level
of decomposition could be examining the model function for the pitch
steering channel of the missile. Decomposition might be achieved to
lower levels as depicted in Figure IV-A-3. Here, we take the specific
model functions and break them down into components. The model function
for missile pitch steering might be further decomposed into seeker
unit, processor, torque converters, and so forth. Neither of these two
graphic depictions is intended to imply that certain parallel and/or
serial relationships within the model need not be accounted for (e.g.,
the adequacy and/or completeness of defining subsystem
interdependency). In fact, the component(s) and their interactions
within the model must account for subsystem interdependency to achieve
the appropriate output.
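
As a rough sketch of what decomposition buys the validator, the Python
below breaks a toy pitch-steering function into two components and checks
each one separately against reference points. Everything here is an
assumption made for illustration: the component gains, the saturation
limit, the reference values, and the tolerance are invented and do not
describe any real missile.

```python
from typing import Callable, List, Tuple

def seeker(angle_error_deg: float) -> float:
    """Hypothetical seeker component with an assumed gain of 0.9."""
    return 0.9 * angle_error_deg

def processor(seeker_signal: float) -> float:
    """Hypothetical processor: assumed gain of 2 with saturation at +/-10."""
    return max(-10.0, min(10.0, 2.0 * seeker_signal))

def pitch_channel(angle_error_deg: float) -> float:
    """Full model function: the components connected in series."""
    return processor(seeker(angle_error_deg))

def worst_deviation(fn: Callable[[float], float],
                    reference: List[Tuple[float, float]]) -> float:
    """Largest absolute gap between a component and its reference points."""
    return max(abs(fn(x) - y) for x, y in reference)

# Invented (input, measured output) pairs standing in for laboratory data.
seeker_ref = [(0.0, 0.0), (1.0, 0.92), (2.0, 1.79)]
processor_ref = [(0.0, 0.0), (1.0, 2.0), (6.0, 10.0)]
tolerance = 0.05

for name, fn, ref in [("seeker", seeker, seeker_ref),
                      ("processor", processor, processor_ref)]:
    dev = worst_deviation(fn, ref)
    print(f"{name}: worst deviation {dev:.3f}, within tolerance: {dev <= tolerance}")
```

Passing each component check does not by itself validate pitch_channel;
as the text stresses, the interaction between components (here, a simple
series connection) must also be examined.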


4-A.3  THE  VALIDATION  PROBLEM 
Validation  has  often  been  used  in  the 
broadest  sense  as  a  measure  of  how  much 
credence  one  should  place  in  the  output  of  a 
given  model.  We  are  using  the  term  more 
narrowly  to  refer  to  the  process  of  checking 
out the model against real world information.
From this perspective, validation is
part  of  determining  the  credibility  of  the 
model  for  a  particular  set  of  uses,  not  the 
totality  of  all  possible  uses.  The  validation 
process  is  designed  to  increase  knowledge 
about  how  well  the  model  represents  reality 
and  to  aid  users  and  decision  makers  in 
determining  whether  the  results  obtained 
from the model sufficiently represent what
they would observe if the situation or entity
were actually played out in the real world.

FIGURE IV-A-3. Decomposition of Model Functions

FIGURE IV-A-4. Potential Comparisons for Validation

There  are  numerous  dimensions  by 
which one can partition the comparisons that
can  be  made  in  validating  a  model.  The 
spectrum  of  potential  comparisons  includes 
elements  of  the  inputs,  the  outputs,  and  the 
model  itself.  As  illustrated  in  Figure  IV-A- 
4,  we  have  partitioned  the  outputs  of  the 
model  into  a  domain  called  output  validation 
and  the  model  inputs  along  with  the  model 
internal  functions  into  a  domain  called 
structural validation. Validation of complex
models should include some combination
of both structural and output validation.


Output  validation  is  the  most  credible 
form  of  validation  and  should  be  conducted 
at  the  full  model  level  to  the  extent  possible. 

When  it  is  not  possible  to  conduct  output 
validation  for  the  full  model,  the  model  can 
be  decomposed  and  output  validation  accom¬ 
plished  for  parts  of  the  model  to  the  extent 
practical. 

Structural  validation  should  also  be  ac¬ 
complished  for  those  aspects  of  the  model 
critical  to  the  model's  use.  The  planned 
application  of  the  model  should  always  be  a 
key  driver  in  establishing  the  details  of  its 
specific  validation. 
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Output  Validation 

In general, output validation usually
contributes  the  most  convincing  evidence  for 
establishing  the  credibility  of  a  model.  It  is 
the  process  of  determining  the  extent  to 
which  the  output  (outcomes  or  outcome 
distributions for the model and/or sub-models)
represents the significant and salient
features  of  the  real  world  systems,  events, 
and  scenarios  it  is  supposed  to  represent. 
Output  validation  involves  collecting  real 
world  data  and  comparing  them  with  the 
output  of  the  model  (i.e.,  MOEs  and  MOPs 
of  interest)  to  assess  how  well  the  model 
results reflect those of the "real world," i.e.,
the  actual  system  or  process  being  modeled. 

In  extremely  complex  and  difficult 
modeling  situations,  the  requirements  for 
comparing  real  world  results  and  model 
results  may  be  difficult,  if  not  impossible,  to 
meet.  This  difficulty  usually  arises  from  the 
inability  to  actually  conduct  a  realistic  exer¬ 
cise  of  the  system  being  modeled  because  of 
certain  constraints  (e.g.,  insufficient  envi¬ 
ronmental  conditions,  resource  constraints, 
safety  considerations,  insufficient  threat 
representation,  inability  to  replicate  the  real 
world  conditions,  etc.).  Even  though  com¬ 
parison  at  the  output  level  of  the  full  model 
may  not  be  possible,  it  is  often  possible  to 
make  comparisons  at  lesser  levels  or  with  a 
scaled-down  version  of  the  system.  Ironi¬ 
cally  enough,  it  is  this  inability  to  replicate 
(or  even  to  understand)  the  real  world  that 
usually  drives  one  to  the  use  of  a  model  in 
the first place. Quantitative approaches to
the  comparisons  usually  provide  the  most 
convincing  evidence  about  model  validation. 
Quantitative  output  validation,  however, 
requires  that  the  outputs  of  the  model  be 
observable  and  measurable  in  the  real  world. 

Qualitative  assessments  made  by 
operational  and  technical  subject  matter 
experts  are  important  inputs  to  the  validation 


process.  As  discussed  earlier,  under  certain 
conditions,  quantitative  comparisons  may  be 
prohibitive  or  limited.  Face  validation  by 
experts  and  other  qualitative  methods  (e.g., 
the  use  of  focused-group  interviews  or  a 
modified  Delphi  technique)  for  obtaining 
expert  opinions  on  critical  model  issues 
should  be  applied.  Findings  from  the  quali¬ 
tative  methods  should  be  used  to  supplement 
and  reinforce  the  available  quantitative 
comparisons. 

Model  outputs  are  generally  selected 
based  on  how  well  they  represent  the  mili¬ 
tary  performance  and  utility  of  a  system 
(i.e.,  MOEs  and  MOPs  as  discussed  earli¬ 
er).  MOEs  and  MOPs  represent  different 
sets  of  system  measures  of  interest  from  the 
perspective  of  operators  and  developers, 
respectively. As depicted in Figure IV-A-5,
these  two  sets  are  not  necessarily  mutually 
exclusive.  For  example,  it  is  highly  proba¬ 
ble  and  desirable  that  both  operators  and 
developers have a keen interest in some of
the  same  measures  for  certain  systems  and 
situations.  Also,  some  form  of  functional 
relationship  normally  exists  between  the 
MOEs  and  MOPs  of  interest,  even  though 
they  may  not  be  well-defined  or  explicit. 

Furthermore,  it  is  extremely  important 
that  observations  and  measurements  made  in 
the  real  world  be  executed  in  such  a  way  that 
they  accurately  represent  the  outputs  of  inter¬ 
est.  This  implies  the  application  of  the  scien¬ 
tific  approach  to  testing  and  experimenting  and 
the  inclusion  of  quantitative  and  qualitative 
statistical  comparisons  where  appropriate. 
Such  comparisons  may  be  made  based  on  data 
points,  intervals,  and  distributions  and  may 
involve  both  absolute  and  relative  values. 
Data  from  testing  should  be  model-compatible 
to  the  extent  possible  so  that  the  model-test- 
model  approach  to  development  can  be  in¬ 
voked. 
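
The point, interval, and distribution comparisons mentioned above can be
sketched as follows. This is a minimal illustration using only the Python
standard library; the two data sets are invented, the two-standard-error
band is an assumed convention rather than a prescribed acceptance
criterion, and the Kolmogorov-Smirnov statistic is one common choice of
distribution comparison, not one the source names.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, stdev

# Invented illustrative values for one MOP (e.g., miss distance in meters).
test_data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5, 4.9]   # "real world" test measurements
model_data = [4.4, 5.0, 5.2, 6.1, 4.7, 5.1, 5.6, 4.5]  # model replications

def ks_statistic(a, b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in sorted(set(a) | set(b)):
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap

# Point comparison: relative difference of the sample means.
rel_diff = (mean(model_data) - mean(test_data)) / mean(test_data)

# Interval comparison: is the model mean inside a rough two-standard-error
# band around the test mean? (The band width is an assumed convention.)
half_width = 2.0 * stdev(test_data) / sqrt(len(test_data))
model_in_band = abs(mean(model_data) - mean(test_data)) <= half_width

# Distribution comparison: two-sample Kolmogorov-Smirnov statistic.
d = ks_statistic(model_data, test_data)

print(f"relative difference of means: {rel_diff:+.3%}")
print(f"model mean within test band:  {model_in_band}")
print(f"KS statistic:                 {d:.3f}")
```

Whether any of these figures is "good enough" remains a judgment for the
accreditor; the code only quantifies the agreement.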
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FIGURE  IV-A-5.  Output  MOEs  and  MOPs 


4-A.3.1.  Structural  Validation 

Structural  validation  deals  with  an  a 
priori examination of the model input data,
the  basic  principles  of  the  model,  and  its 
assumptions  to  determine  the  degree  to 
which they are complete, logically consistent,
and reasonable for the types of uses
envisioned.  To  a  large  extent,  structural 
validation  is  designed  to  increase  the  knowl¬ 
edge  and  confidence  of  developers,  users, 
customers,  decision  makers,  and  indepen¬ 
dent  reviewers  in  the  model  results  by  dem¬ 
onstrating  that  the  model  has  internal  integ¬ 
rity.  Structural  validation  is  the  process  of 
determining  the  extent  to  which  the  input 
data  (e.g.,  scenario(s),  mission(s),  data 
bases,  etc.)  and  the  model  (e.g.,  assump¬ 
tions,  logic,  algorithms,  parameters,  coding, 
etc.)  represent  the  significant  and  salient 
features  of  real  world  systems,  events,  and 
scenarios.  Model  decomposition  greatly 
aids  in  this  process. 

Structural  validation  may  be  the 


primary  form  of  validation  that  can  be 
accomplished  with  extremely  complex  mod¬ 
els,  especially  when  observations  and  mea¬ 
surements  of  the  outputs  of  interest  are  not 
possible  or  practical  in  the  real  world. 
Structural validation can occur through
empirical  measurement  and  comparison,  as 
well  as  theoretical  examination  based  on 
physical  first  principles,  logic  and  axiomatic 
systems,  and  other  scientific  laws.  Such 
comparisons  can  include  qualitative  ap¬ 
proaches,  such  as  expert  opinions,  or  statis¬ 
tical  tests  based  on  both  quantitative  and 
qualitative  data.  As  one  proceeds  to  greater 
refinement and depth in structural validation,
several dimensions on which to base
these  comparisons  may  be  examined.  For 
example,  one  might  pursue  validating  criti¬ 
cal  parts  of  the  model  structure  in  terms  of 
theory,  performance  against  other  accredited 
models,  scaled  down  or  limited  component 
testing,  and  any  available  combat  history 
relating  to  the  specific  portion  of  the  model 
being  addressed. 
TABLE IV-A-2. Examples of Information Sources

•  Functional Experts
•  Operational Experts
•  Scientific Theory (physics, engineering, behavioral, etc.)
•  System Design Information
•  Laboratory Measurements (components, response functions, etc.)
•  Special Test & Training Facility Measurements (hardware-in-the-loop,
   antenna patterns, radar cross section, infrared, millimeter wave,
   human factors, decision making, etc.)
•  System Level Measurements
   ••  Developmental Test and Evaluation
   ••  Operational Test and Evaluation
•  Live Fire Test Measurements
•  Combat History and Measurements
•  Other Accredited Models

4-A.3.2. Sources of Information

Model  validation  requires  the  use  of 
a broad range of sources of information (see
Table  IV-A-2).  This  information  is  both 
qualitative  and  quantitative  and  can  vary 
from  the  opinion  of  functional  experts, 
operational  experts,  and  scientists  to  com¬ 
parisons  based  on  precise  deterministic  and 
probabilistic  measurements.  Scientific 
theory,  system  design  information,  and 
laboratory  measurements  can  provide  essen¬ 
tial  structural  information.  Special  test  and 
training  facility  measurements,  as  well  as 
system  level  measurements  during  develop¬ 
mental  and  operational  test  and  evaluation, 
are  excellent  sources.  Finally,  live  fire  test 
measurements,  historical  information  and 
combat  measurements,  and  data  (which 
include  both  pertinent  information  from 
VV&A  and  relevant  output  data)  from  other 
accredited models used for similar applications
are potential sources of validation
information. 


4-A.3.3.  Sensitivity  Analysis 

The  sensitivity  analysis  is  a  critical  part 
of  both  structural  and  output  validation. 
Sensitivity  analysis  is  a  formal  examination 
of  how  output  variables  of  the  model  re¬ 
spond  to  changes  in  inputs,  assumptions, 
parameters,  and  critical  logic  functions. 
Sensitivity  analysis  should  be  conducted  on 
the  total  model  and  on  all  its  decomposed 
parts.  Sensitivity  analysis  can  be  used  to 
check  for  proper  responses  to  input  variables 
and  to  identify  marginal  break  points  and 
special  limiting  values.  It  can  be  used  to 
understand  better  how  the  model  works  and 
to  help  identify  errors  in  the  model  structure 
and/or  code.  However,  sensitivity  analysis 
is limited to telling you only what the model
is  sensitive  to;  one  has  to  go  further  with  the 
comparative  validation  process  to  ensure  that 
the  model  sensitivities  are  indeed  representa¬ 
tive  of  the  real  world. 
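
A one-at-a-time perturbation is perhaps the simplest form of the
sensitivity analysis described above. The sketch below is illustrative
only: toy_model, its parameter names, its functional form, and the
baseline values are all hypothetical, and the 10 percent step is an
arbitrary assumption. It reports a normalized sensitivity (relative change
in output per relative change in input) for each input.

```python
from typing import Callable, Dict

def toy_model(p: Dict[str, float]) -> float:
    """Hypothetical MOE computed from three inputs; the functional form
    is arbitrary and chosen only so the inputs differ in influence."""
    return p["weapons"] * p["p_detect"] * p["p_hit"] ** 2

def one_at_a_time(model: Callable[[Dict[str, float]], float],
                  baseline: Dict[str, float],
                  rel_step: float = 0.10) -> Dict[str, float]:
    """Perturb each input by rel_step in turn, holding the others at
    baseline, and return the normalized change in the output."""
    base_out = model(baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + rel_step)
        result[name] = (model(perturbed) - base_out) / base_out / rel_step
    return result

baseline = {"weapons": 4.0, "p_detect": 0.8, "p_hit": 0.6}
sensitivities = one_at_a_time(toy_model, baseline)
for name, s in sorted(sensitivities.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>9}: {s:.2f}")
```

The ranking shows only what this toy function is sensitive to. Whether
the real system shares that sensitivity is exactly the question the
comparative validation process still has to answer.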
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A  vital  part  of  sensitivity  analysis  is 
to  help  one  understand  where  the  model 
results  are  extremely  sensitive  to  changes  in 
model  algorithms,  input  data,  parameters, 
and/or  assumptions.  These  sensitivities 
should  be  of  paramount  interest  to  those 
who  have  to  validate  and  accredit  a  model  as 
well  as  those  who  must  rely  heavily  on  the 
model output for important decisions. This
discussion also addresses the situation where a
change in a given part of the model is found
not to produce a significant or pronounced
change in the major output of interest. For
example,  it  might  be  that  doubling  the  reli¬ 
ability  of  a  given  subsystem  will  only  slight¬ 
ly  change  the  overall  total  mission  reliability 
because  that  particular  subsystem  does  not 
really  make  a  difference  in  being  able  to 
successfully  complete  the  mission. 
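
The reliability example can be made concrete with a short calculation.
The mission structure below is an assumption introduced for illustration
(a backup subsystem in parallel with a highly reliable primary, in series
with everything else), and all reliability values are invented.

```python
def mission_reliability(r_primary: float, r_backup: float, r_other: float) -> float:
    """Hypothetical mission: succeeds if `other` works AND at least one of
    the primary/backup pair works (independent failures assumed)."""
    redundant_pair = 1.0 - (1.0 - r_primary) * (1.0 - r_backup)
    return r_other * redundant_pair

# Double the backup subsystem's reliability from 0.40 to 0.80.
before = mission_reliability(r_primary=0.99, r_backup=0.40, r_other=0.95)
after = mission_reliability(r_primary=0.99, r_backup=0.80, r_other=0.95)
print(f"before: {before:.4f}  after: {after:.4f}  change: {after - before:.4f}")
```

Under these assumptions the mission reliability moves only from roughly
0.944 to roughly 0.948, because the primary subsystem already covers
almost every case in which the backup would be needed.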

4-A.3.4.  Tasks  for  Model  Validation 

Example  tasks  that  should  be  accom¬ 
plished  and  the  results  documented  during 
model  validation  can  be  discussed  in  terms 
of  preparing  to  conduct  the  validation, 
conducting  structural  validation,  and  con¬ 
ducting  output  validation.  These  specific 
tasks are listed in Table IV-A-3.

During  preparation  for  model  validation, 
it  is  critical  to  specifically  define  the  prob¬ 
lem  to  be  modeled  and  addressed  (that  is, 
the  problem  or  class  of  problems  for  which 
the  model  is  being  validated).  Definition  of 
the  problem  will,  to  a  large  extent,  establish 
the  intended  application  of  the  model.  It  is 
important  to  remember  that  model  validation 
is  accomplished  for  a  particular  problem 
(application)  or  class  of  problems.  One 
must  develop  and/or  select  the  appropriate 
scenario(s)  and  mission(s)  to  address  the 
problem  and  assess  their  realism  in  terms  of 
the  real  world.  The  MOEs  and  MOPs 
required  to  address  the  specifics  of  the 


problem  will  be  the  primary  model  outputs 
examined.  It  is  important  at  this  stage  to 
take  into  account  whether  or  not  this  model 
will  be  used  in  a  stand-alone  role  or  as  one 
level in a model hierarchy. This will impact
the  inputs  and  outputs  and  how  model  real¬ 
ism  needs  to  be  addressed. 

Once  the  above  tasks  are  completed, 
output  and  structural  validation  (required  to 
produce  data  to  address  the  MOEs  and 
MOPs)  can  be  addressed.  Generally,  the 
amount  of  output  validation  that  practically 
can  be  accomplished  will  influence  the 
amount  of  structural  validation  necessary. 
Quantitative  output  validation  normally  is 
performed as a comparative test or experiment
which provides a quantitative assessment
of the agreement of the model with the
real  world.  Sensitivities  of  the  model 
output  MOEs  and  MOPs  to  inputs,  critical 
model  logic,  and  assumptions  should  be 
identified  and  quantified  to  the  extent  practi¬ 
cal.  When  output  validation  cannot  be 
performed  at  the  full  model  level,  the  model 
should  be  decomposed  and  structural  valida¬ 
tion  applied.  As  discussed  earlier,  this 
involves  a  comparison  of  input  parameters, 
data  bases,  assumptions,  and  model  func¬ 
tions  with  the  real  world.  Sensitivity  analy¬ 
sis  should  always  be  a  key  tool  during  both 
structural  and  output  validation. 

4-A.3.5. Stakeholders in Model Validation

The  model  developer,  user,  independent 
reviewer,  customer,  and  decision  maker  all 
have  an  interest  and  responsibility  in  the 
validation  of  a  model.  If  a  new  model  is 
being  developed  for  a  given  use  or  class  of 
uses,  the  model  developer  should  verify  and 
validate  the  model  to  the  extent  practical  and 
document those results. (Verification, as discussed in Chapter 3,
should be performed routinely as part of the programming and checkout
phases of a model's development.) In the early stages, the validation
effort likely will be more structural- than output-oriented. When a
user other than the developer selects and applies a model, the user
inherits a responsibility for properly applying the model as well as
conducting any additional verification and validation necessary for the
problem at hand. (The user will also want to ensure that the model or
simulation has been accredited for his application, as discussed in
Chapter 5.)

TABLE IV-A-3. Example Tasks for Conducting Model Validation

CATEGORY               TASKS

Preparation            •  Define specific function/system to be modeled
                          and addressed.
                       •  Develop/select level of model required (end
                          game, one-on-one, campaign, etc.).
                       •  Develop/select scenario(s), mission(s), etc.
                          (address reasonableness versus real world).
                       •  Determine whether model will be used as
                          stand-alone or as one level in a hierarchy.
                       •  Identify specific model output MOEs and MOPs
                          required to address the problem.
                       •  Identify input from and output to other models.
                       •  Select and implement the appropriate category
                          or categories of validation.

Output Validation      •  Quantify agreement of output MOEs and MOPs of
                          interest versus real world.
                       •  Quantify sensitivities of model output MOEs and
                          MOPs of interest to inputs, critical model
                          logic, assumptions, and parameters.
                       •  Conduct face validation and other appropriate
                          forms of expert qualitative assessments.

Structural Validation  •  Compare input scenarios, parameters, and data
                          bases versus real world.
                       •  Address adequacy of inputs from other models
                          and outputs to other models.
                       •  Address assumptions versus real world.
                       •  Address total and decomposed model functions
                          versus real world.
                       •  Address sensitivities of model output MOEs and
                          MOPs of interest to inputs, critical model
                          logic, assumptions, and parameters.
                       •  Address interdependency of logic functions.
                       •  Address adequacy/completeness of model logic.
                       •  Address adequacy of model in context of model
                          hierarchy.

When  critical  issues  are  to  be  addressed 
by  modeling  and  simulation,  it  is  beneficial 
to  have  someone  other  than  the  developer 
and  user  (i.e.,  an  independent  reviewer) 
conduct  additional  model  verification  and 
validation.  This  is  designed  to  provide  a 
separate  and  objective  look  and,  optimisti¬ 
cally,  offsets  any  biases  that  the  developer 
and  user  may  have.  The  customers  and 
decision  makers  also  have  a  high  stake  in 
model  verification  and  validation.  For 
example,  the  decision  to  procure  a  major 
weapon  system  may  be  based  largely  on 
model  results;  and  the  validity  of  the  deci¬ 
sion  could  depend  on  the  validity  of  the 
model.  Furthermore,  those  responsible  for 
accreditation  of  the  model  will  rely  on 
verification  and  validation  reports  in  arriving 
at  their  decision  (see  Chapter  5). 

As  discussed  in  Chapter  2,  it  is  essential 
to  maintain  configuration  control  of  models. 
When  there  are  multiple  users  of  a  model 
and  the  various  users  are  modifying  the 
model  to  accommodate  their  particular  needs 
or  usage  requirements,  then  each  version 
will  require  validation  (and  accreditation)  for 
that  particular  application. 

4-A.3.6.  The  Model  Validation  Plan 

Validation  of  any  given  model  should 
be  a  continuing  process  with  appropriate 
documentation  of  the  results  at  various  key 
application  points  throughout  the  life  cycle 
of  the  model.  A  formal  plan  should  be 
developed  to  conduct  this  validation.  The 
validation  plan  could  be  developed  sequen¬ 
tially  over  the  lifetime  use  of  the  model, 
with  a  basic  plan  covering  the  initial  valida¬ 
tion  and  supplements  as  needed  to  address 


each  unique  application  and/or  configuration 
update.  The  information  set  forth  in  the 
validation  plan  should  be  sufficient  to  sup¬ 
plement  other  program  and  decision  making 
documentation,  as  well  as  to  serve  as  the 
road  map  for  validation  of  the  model  at  a 
specific  point  in  time.  The  validation  plan 
and  report  should  be  the  key  documentation 
that  supports  the  decision  to  accredit  the  use 
of  a  model.  A  sample  format  illustrating  the 
types  of  information  that  should  be  included 
in  the  validation  plan  is  provided  below. 

I.  EXECUTIVE SUMMARY

II.  BACKGROUND
     •  Purpose of the Validation Effort
     •  General Description of the Model
     •  Previous and Planned Usage
     •  Program and Decision Making Structure

III.  PROBLEM
     •  Specific Problem(s) Being Addressed
        ••  MOEs and MOPs
        ••  Critical Evaluation Issues
     •  Critical Validation Issues (Related to Critical Evaluation
        Issues)

IV.  APPROACH
     •  Validation Task(s) To Be Addressed
     •  General Approach to Validation
        ••  Scope
        ••  Limitations
     •  Specific Approach to Validation for Each Task

V.  DESCRIPTION OF VALIDATION EXPERIMENT
     •  Output
     •  Structural

VI.  ADMINISTRATION
     •  Validation Management and Schedule
     •  Tasking and Responsibilities
     •  Safety
     •  Security
     •  Environmental Impact

VII.  REPORTING

VIII.  ATTACHMENTS (As Required)

The background information in the validation
plan should sufficiently describe the
model,  its  present  configuration,  and  previ¬ 
ous  and  planned  applications.  It  should  also 
relate  how  the  model  fits  into  the  overall 
program  and  decision-making  structure. 
The  specific  problem  that  the  model  is  to 
address  should  be  clearly  delineated  along 
with  the  MOE(s)  and  MOP(s)  of  interest. 
Critical  evaluation  issues  related  to  the 
problem,  along  with  the  specific  tasks 
planned  for  the  validation  process,  should  be 
addressed.  Critical  validation  issues  should 
be  identified  and  addressed  in  terms  of  how 
they  relate  to  the  critical  evaluation  issues 
that  must  be  addressed  by  the  model.  The 
general  and  specific  approaches  to  be  used 
for  validation  tasks  should  be  addressed. 
Planned  validation  experiments  for  both 
output  and  structural  validation  should  be 
described.  Finally,  such  tasks  as  scheduling,
planning  for  administration  and  manage¬ 
ment,  tasking  and  responsibilities,  and 
reporting  on  the  validation  efforts  should  be 
formally  documented  in  the  plan. 


4-A.3.7.  Documentation  of  the  Model 
Validation  Efforts 

Formal  documentation  of  the  model 
validation  activities  and  reports  for  each 
model  validation  effort  are  essential  for 
proper  life  cycle  management  of  the  model. 
The  documentation  and  reports  should  be 
directed  at  assisting  the  customer  and  the 
decision  maker  in  the  model  accreditation 
process  (i.e.,  it  should  provide  information 
that  helps  the  decision  maker  decide  whether 
the  model  is  good  enough  for  the  specific 
application  and  problem  being  addressed). 
Documentation  of  prior  validation  efforts 
also  assists  those  tasked  to  conduct  subse¬ 
quent  validation  efforts  (i.e.,  it  should  pro¬ 
vide  the  basis  for  accumulating  validation 
information).  The  documentation  efforts 
include  collection  of  information  and  data  as 
the  validation  plan  is  executed  and  analysis 
and  reporting  of  the  model  validation  re¬ 
sults. 

A  sample  format  illustrating  the  types  of 
information  that  are  of  interest  in  the  validation
report  is  provided  below.  Ideally,  the
executive  summary  will  concentrate  on  the 
model,  critical  issues  regarding  its  applica- 
tion(s),  and  its  strengths  and  weaknesses  for 
addressing  the  specific  problem(s)  in  terms 
of  the  real  world  comparisons  made  during 
validation.  Sections  II  through  V  provide 
background  on  what  was  planned  for  model 
validation  and,  except  as  modified,  could  be 
extracted  from  the  plan.  Sections  VI,  VII,
and  VIII  address  both  general  and  specific
results  of  the  validation  effort  along  with  a 
detailed  accounting  of  the  specific  validation 
findings.  Model  trends  and  sensitivities,  as 
they  relate  to  the  problem  and  the  critical 
evaluation  issues,  should  be  described.
Quantitative  methods  are  highly  desirable
for  the  validation  comparisons  and  should  be 
documented.  Qualitative  methods  also  are 


useful  and,  because  they  usually  include 
more  subjective  judgments,  may  require
even  more  documentation. 

I.  EXECUTIVE  SUMMARY 

II.  BACKGROUND

III.  PROBLEM

IV.  APPROACH

V.  DESCRIPTION  OF  VALIDATION  EXPERIMENT

•  Output

•  Structural

VI.  RESULTS  OF  VALIDATION  EFFORTS

•  General

•  Validation  by  Task

•  Discussion  of  Critical  Issues

•  Discussion  of  Model  Trends
and  Sensitivities

VII.  SPECIFIC  VALIDATION  FINDINGS

VIII.  ATTACHMENTS  (As  Required)

4-A.3.8.  Special  Considerations  When 
Validating  Models 

Model  validation  will  always  require 
some  level  of  judgment,  but  to  the  maxi¬ 
mum  extent  possible,  empirical  comparisons 
should  be  made.  As  discussed  earlier,  the 
process  of  validating  a  model  will  never  be 
exhaustive.  There  will  always  be  some
things  not  addressed  during  validation,  and 
others  not  addressed  to  the  degree  that  some 
would  like.  Consequently,  for  model  vali¬ 
dation  to  be  a  productive  process,  it  must 
concentrate  on  the  specific  uses  of  the  model 
and  the  actual  validation  issues  addressed 


regarding  those  specific  uses.  The  valida¬ 
tion  comparisons  should  present  a  reason¬ 
able,  systematic  examination  of  the  model, 
and  an  objective  picture  of  its  true  capabili¬ 
ties  and  limitations  in  that  application.  Both 
the  strengths  and  weaknesses  of  the  model 
for  addressing  the  stated  problem(s)  must  be 
communicated  to  the  accreditor  by  the  validation
process.

The  validation  process  should  be  exten¬ 
sive  and  robust  enough  to  properly  consider 
the  findings  and  views  of  neutrals,  advo¬ 
cates,  adversaries,  and  other  interested 
parties.  The  goal  should  be  to  communicate 
all  important  findings  regarding  model 
comparisons  and  critical  validation  issues  to 
the  accreditor.  When  serious  competing 
views  emerge  on  critical  validation  issues,  it 
may  be  necessary  to  pursue  further  validation
efforts  that  can  provide  additional  objective
comparisons  and  information  for
consideration  by  the  responsible  accreditation
authority.

There  should  be  test  data  available  for 
comparison  on  each  critical  issue  to  be  ad¬ 
dressed  by  the  model.  If  feasible,  it  is 
desirable  to  collect  two  sets  of  real  world 
data  —  one  for  structural  comparisons  and 
another  for  output  comparisons.  The  valida¬ 
tion  process  should  be  such  that  when  data 
derived  from  realistic  field  and  development 
testing  raise  questions  about  prior  assump¬ 
tions  and/or  propositions  of  the  model,  these 
questions  are  addressed.  The  process  of 
validation  must  shed  light  on  what  we  do 
and  do  not  know  about  the  model's  structur¬ 
al  content,  its  internal  functions  and  capabil¬ 
ities,  and  its  output  accuracy.  Our  analysis 
of  conflicting  or  discrepant  information 
often  provides  the  insights  necessary  for 
improving  the  models  and  obtaining  better 
answers  to  difficult  questions. 


Independent  technical  and  operational  ex¬ 
perts  can  examine  the  model  processes,  its 
assumptions,  inputs,  and  outputs  to  arrive  at 
their  opinions  of  the  appropriateness  and 
validity  of  the  model  and  its  associated 
results.  In  the  academic  world  and  in  the 
field  of  operations  research,  this  independent 
review  is  often  performed  by  a  separate 
unbiased  party  (i.e.,  a  referee)  who  is  re¬ 
sponsible  for  helping  maintain  the  objectivity 
of  the  analysis.  Unfortunately,  when  deal¬ 
ing  with  highly  complex  and  often  classified 
systems  and  techniques,  this  objectivity  can 
be  somewhat  limited,  especially  if  documen¬ 
tation  is  not  adequate.  Therefore,  it  is 
incumbent  upon  all  interested  parties  (e.g., 
model  developers,  users,  decision  makers, 
and  other  responsible  authorities)  to  ensure
that  the  validation  process  is  objective, 
comprehensive,  and  well  documented. 

4-A.4  SUMMARY 

The  Military  Operations  Research  Soci¬ 
ety  advocates  the  formal  determination  and 
documentation  of  model  credibility  through 
a  three-part  investigation  involving  verification,
validation,  and  accreditation.  This
chapter  addressed  the  second  part,  model
validation,  which  is  defined  as  "the  process 
of  determining  the  degree  to  which  a  model 
is  an  accurate  representation  of  the  real 
world  from  the  perspective  of  the  intended 
uses  of  the  model.” 


A  model  can  be  portrayed  in  terms  of  its 
inputs,  the  model  itself,  and  its  outputs. 
Thus,  when  validation  comparisons  are 
made  against  the  real  world  or  physical 
theories  and  laws  associated  with  the  real 
world,  they  can  be  addressed  in  terms  of 
model  inputs,  the  model  itself,  and  model 
outputs. 

Model  validation  has  been  partitioned 
into  two  parts:  (1)  output  validation  and  (2) 
structural  validation.  Output  validation  is 
the  most  credible  form  of  validation  and 
consists  of  comparing  the  output  of  the 
model  against  real  world  observations. 
Structural  validation  involves  determining 
the  extent  to  which  the  input  data  (e.g., 
scenario(s),  mission(s),  databases,  etc.) and 
the  model  (e.g.,  assumptions,  logic,  algorithms,
parameters,  code  execution,  etc.)
represent  the  significant  and  salient  features
of  real  world  systems,  events,  and
scenarios.  Decomposition  of  the  model  into
fundamental  model  functions  and  compo¬ 
nents  aids  in  the  process  of  structural  valida¬ 
tion. 

Validation  of  complex  models  requires 
an  appropriate  combination  of  both  structur¬ 
al  and  output  validation.  Maintaining  con¬ 
figuration  control  and  essential  documenta¬ 
tion  are  important.  A  formal  plan  for  model 
validation,  along  with  adequate  reporting 
and  documentation  of  the  results  as  de¬ 
scribed  herein,  are  vital  parts  of  the  model 
validation  process. 
PART  B  -  THE  MULTIDIMENSIONAL  SPACE  OF  VALIDATION

by  Dale  Henderson 


4-B.0  INTRODUCTION

Almost  any  discussion  of  the  kinds  of 
validation  and  of  the  activities  involved 
quickly  becomes  confused  and  confusing. 
One  reason  for  this  is  that  the  "space"  of
validation  activities  has  many  separable 
dimensions.  If  discussants  do  not  first 
describe  this  space  they  run  a  great  risk  of 
individually  focusing  on  a  different  "coordi¬ 
nate"  in  this  space;  they  are  then  discussing 
different  aspects  of  the  problem  using  much 
the  same  words.  This  limits  agreement  to 
the  superficial  and  generally  results  in  con¬ 
fusion. 

We  have  decomposed  the  space  of  vali¬ 
dation  or  validation  activities  into  five  com¬ 
ponent  dimensions  in  Figure  IV-B-1.  These 
separate  dimensions  describe  (1)  the  techniques
used,  (2)  the  basis  of  truth  used,  (3)
the  applications  intended  for  the  model  or
simulation,  (4)  the  degree  of  composition  of
the  model,  and  (5)  the  depth  of  the  validation
effort  itself.  We  also  indicate  a  possible
sixth  dimension,  the  verification  of  the
software  against  its  own  standard  or  stan¬ 
dards.  This  sixth  dimension  would  really  be 
found  to  compose  several  dimensions,  which 
may  largely  be  discussed  without  reference 
to  the  present  five.  That  is,  it  composes  a 
separate  space  which  is  studied  in  earlier 
sections  of  this  monograph. 

A  given  activity  in  evaluating  the  validi¬ 
ty  of  a  model  or  simulation  will  generally  be 
a  point  in  this  multidimensional  space,  but
not  all  points  are  occupied  by  validation
efforts.  For  instance,  the  activity  called 


Face  Validation  is  easily  identified.  Its 
depth  is  usually  shallow  or  at  the  surface. 
The  Delphi  technique  is  most  common. 
And  the  model  applications  assumed  are 
most  often  analysis.  The  activities  of  Face 
Validation  are,  however,  less  localized  along 
the  other  two  axes:  the  degree  of  decompo¬ 
sition  of  the  model  is  generally  irrelevant 
and  the  basis  of  truth  is  commonly  history, 
trial  data,  or  theory. 

4-B.l  APPLICATION 

The  application  coordinate  is  meant  to 
recognize  that  models  are  employed  in 
qualitatively  different  fashion  and  that  what 
is  even  meant  by  validation  depends  on  the 
sort  of  application.  We  indicate  three  kinds 
of  applications  which  form  a  rough  progression:
synthesis,  analysis,  and  prediction.  A
synthetic  application  is  the  use  of  a  model 
to  provide  meaning  or  consistency,  to  fur¬ 
ther  human  understanding,  but  without 
adding  any  new  knowledge;  synthesizing 
understood  pieces  into  a  larger  whole. 
Training  models  and  simulations  fall  into 
this  category.  An  analytic  application  is  one 
to  further  understanding  by  providing  a 
structure  for  further  abstraction;  human 
understanding  is  extended  through  their  use. 
Such  applications  could  be  tests  of  historical 
records  of  battle  against  purported  explana¬ 
tions  or  correlations.  A  predictive  applica¬ 
tion  is  just  that:  a  prediction  of  something 
new  which  can  then  be  observed  for  its 
degree  of  compliance.  (These  distinctions 
are  well  developed  in  the  RAND  report  "Is 
It  You  or  Your  Model  Talking?  A  Framework
for  Model  Validation,"  by  James  S.
Hodges  and  James  A.  Dewar,  R-4114-
AF/A/OSD,  1992.)

Figure  IV-B-1.  The  Multidimensional  Space  of  Validation

4-B.2  TRUTH  BASIS

The  basis  of  truth  coordinate  represents
the  differing  norms  against  which  a  model 
under  consideration  may  be  judged:  "What 
is  correct?"  Obvious  data  include  historical 
records  (generally  of  combat),  data  from 


field  trials,  the  output  from  other  models, 
and  a  priori  theory.  No  matter  which  bases 
are  used,  the  correctness  and  applicability  of 
the  data  should  be  explicitly  examined. 
Unquestioned  assumptions  of  applicability
are  especially  dangerous. 

4-B.3  TECHNIQUE 

The  several  techniques  are  arranged  in
an  order  indicating  the  amount  of  data  considered
in  the  process.  The  Delphi  process
consists  of  a  group  of  purported  topical 
experts  functioning  in  some  structured  man¬ 
ner.  Their  analyses  may  skip  between 
surface  values,  trends,  consideration  of  the 
implications  of  the  composition  of  the  mod¬ 
el,  or  implications  of  the  algorithms  em¬ 
ployed.  While  the  volume  of  data  consid¬ 
ered  will  be  small,  the  genius  of  the  process 
is  that  it  will  be  the  most  appropriate,  being 
somehow  selected  by  the  panel  of  experts. 

Special  cases  are  specified  scenarios  or 
instances  for  which  we  have  some  reason  to 
believe  that  the  correct  results  are  known. 
Often  these  are  limiting  or  extreme  cases 
under  which  many  factors  become  unimpor¬ 
tant  or  under  which  algorithms  mathemati¬ 
cally  simplify  to  analytically  known  results. 
Other  special  cases  could  include  field  trial 
results  for  which  measurements  exist. 
Comparisons  are  similar,  but  involve  more 
data  or  trends  of  data. 
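
As an illustration only, the special-case technique can be sketched for a hypothetical toy model that is not one discussed at the workshop: the probability that at least one of several independent shots hits a target. The function names, trial count, seed, and tolerance are assumptions made for this sketch. The limiting cases (hit probability of zero or one, or no shots fired) have answers known from common sense, and the general case reduces to the analytic result 1 - (1 - p)^n, so each simulated value can be checked against a known one.

```python
import random


def simulate_kill_probability(p_hit, shots, trials=20000, seed=1):
    """Toy Monte Carlo model: fraction of trials with at least one hit."""
    rng = random.Random(seed)
    kills = 0
    for _ in range(trials):
        # A trial is a kill if any of the independent shots hits.
        if any(rng.random() < p_hit for _ in range(shots)):
            kills += 1
    return kills / trials


def analytic_kill_probability(p_hit, shots):
    """Known result for independent shots: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_hit) ** shots


def check_special_cases(tolerance=0.02):
    """Compare the simulation with known answers for limiting and simple cases."""
    # (p_hit, shots): the first three are limiting cases, the rest simple cases.
    cases = [(0.0, 5), (1.0, 5), (0.3, 0), (0.3, 1), (0.3, 4)]
    results = []
    for p_hit, shots in cases:
        simulated = simulate_kill_probability(p_hit, shots)
        exact = analytic_kill_probability(p_hit, shots)
        agrees = abs(simulated - exact) <= tolerance
        results.append((p_hit, shots, simulated, exact, agrees))
    return results


if __name__ == "__main__":
    for p_hit, shots, simulated, exact, agrees in check_special_cases():
        print(p_hit, shots, round(simulated, 3), round(exact, 3), agrees)
```

A disagreement on any row would point to an error in the model logic or its code rather than to a limitation of the real-world data, which is what makes such cases cheap and informative.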

Appeals  to  a  truth  basis  from  an  accept¬ 
ed  model  are  usually  through  a  fairly  ex¬ 
haustive  set  of  input  and  output  compari¬ 
sons.  It  is  also  common  to  employ  graphi¬ 
cal  presentations  because  people  can  often 
extract  (qualitatively  unmeasured)  trends 
from  appropriate  graphical  presentations. 
The  most  efficient  comparisons,  to  any  set 
of  truth  data,  employ  statistical  sampling  and
well-designed  numerical  experiments  to
explore  as  many  sensitivities  as  possible 
with  each  case  considered. 


4-B.4  COMPOSITION 

If  a  model  represents  one  thing,  by 
itself,  with  all  external  interfaces  through 
parametric  and  known  data,  then  a  monolithic
model  is  appropriate.  Even  where  these
restrictions  are  not  met,  monolithic  models  are
commonplace  (albeit  often  only  because  of  poor
software  design).  At  the  other  extreme  are
models  and  simulations  composed  of  objects 
which  are  in  turn  composed  of  parts  or 
elements.  This  composition  is  in  accord 
with  modern  software  practice,  but  raises 
several  issues  with  respect  to  validation. 
First,  the  separate  and  separable  objects  can 
be  considered  separately;  different  techniques
and  truth  bases  being  applied  to
different  parts.  But  the  interactions  between 
objects  must  be  confirmed  as  an  additional 
consideration.  The  middle  case,  labeled 
amalgamated,  is  just  meant  to  indicate  the 
intermediate  case  involving  a  few  objects, 
these  perhaps  having  been  abstracted  from  a 
.  her  ensemble  for  some  purpose. 

4-B.5  DEPTH 

The  depth  dimension  is  a  measure  of  the 
degree  of  quantitative  detail.  Surface  mea¬ 
sures  are  just  that,  often  applied  in  Face 
Validation  by  people.  Formal  measures  are 
quadratures  or  other  abstractions,  perhaps 
correlations  among  the  data  themselves,
which  are  of  primary  interest  to  decision 
makers.  The  detailed  variables  involved  in 
a  calculation  (all  of  them)  are  ultimately 
available  for  comparison  with  (say)  field 
trial  data  or  with  the  results  from  other, 
accepted,  models. 

See  Appendix  A  for  Selected  Bibliography 
CHAPTER  4


PART  C  -  FACE  VALIDATION  AND  FACE  VALIDITY 

by  D.  P.  Gaver 


FACE  VALIDATION  is  the  process  of 
using  informed  experts  to  examine 
the  conceptual  background,  execu¬ 
tion,  and  output  of  a  simulation  mod¬ 
el. 


The  examination  begins  by  a  review  of 
the  operational  questions  that  the  model  is 
designed  to  address;  thus  the  relevance  of 
the  model  is  first  examined.  These  ques¬ 
tions  will  often  be  quantitatively  expressed 
("How  much  of  Items  x,  y,  z  should  I  stock 
at  locations  u,  v,  w  for  quarter  3  of 
1993?"),  and  degree  of  success  in  answering 
them  correctly  may  be  expressed  quantitatively,
i.e.,  in  terms  of  Measures  of  Performance
(MOPs)  and  Measures  of  Effectiveness
(MOEs).  The  experts  performing  face
validation  will  first  ask  if  answers  to  the 
questions  addressed  by  the  model  will  assist 
the  client  decision  maker.  Then  the  expert 
will  comment  on  the  way  those  quantitative 
questions  are  answered:  are  the  answers 
comprehensible  to  the  client,  are  the  restric¬ 
tions  implicit  in  the  modeling  approach 
made  clear  so  that  the  model  will  not  be 
misused,  are  uncertainties  in  the  model 
conclusions  adequately  portrayed  as  these 
depend  upon  model  inputs  (data  on  parame¬ 
ter  values,  etc.)  and  organizational  structure 
and  behavior?  The  experts  performing  this 
first  stage  of  face  validation  are  conducting
an  overview  for  model  relevance  and  usability.

The  second  stage  of  face  validation  is  an
independent  overall  assessment  of  the  credi¬ 


bility  of  detailed  model  output.  This  can  be 
approached  initially  by  checking  whether 
apparently  correct  numerical  values  result  in 
known  cases,  i.e.,  when  certain  parameter 
values  are  specified,  e.g.,  set  equal  to  zero. 
Obedience  to  physical  laws  can  be  checked, 
if  relevant.  Correspondence  to  other 
models’  outputs  can  be  checked;  such  other 
models  can  be  simple  "back  of  the  enve¬ 
lope"  creations  of  the  experts  themselves. 
The  face  validation  experts  might  also  ask, 
parenthetically,  what  the  present  model 
offers  that  an  existing  model  does  not.  In 
the  process  the  model’s  output  options  can 
be  critiqued:  are  there  informative  graph¬ 
ics?  Are  tables  of  numbers  arranged  so  that 
their  implications  are  clear?  Are  numerical 
results  expressed  to  credible  accuracy,  not  to 
absurdly  many  significant  digits?  Are  error 
assessments  of  results  given  in  a  believable 
and  comprehensible  way  (documentation 
should  cover  this)?  All  of  these  steps  ad¬ 
dress  the  overall  question  of  how  well  the 
model  does  what  it  advertises  to  do. 
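
One of the output checks above, that results not be reported to absurdly many significant digits, can be sketched in a few lines. This is a minimal illustration under an assumed convention (round a value to the decimal place of the leading digit of its standard error); the function names are hypothetical and other reporting conventions are equally defensible.

```python
import math


def round_to_uncertainty(value, std_error):
    """Round value to the decimal place of the leading digit of std_error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    # Position of the leading digit of the error, e.g. 0.02 -> 2 decimals,
    # 12.3 -> -1 (round to the nearest ten).
    decimals = -math.floor(math.log10(std_error))
    return round(value, decimals)


def report(value, std_error):
    """Format a result and its error to matching, credible precision."""
    rounded_value = round_to_uncertainty(value, std_error)
    rounded_error = round_to_uncertainty(std_error, std_error)
    return f"{rounded_value:g} +/- {rounded_error:g}"


if __name__ == "__main__":
    print(report(3.14159265, 0.02))
    print(report(1234.567, 12.3))
```

A reviewer performing face validation could apply the same idea by eye: digits beyond the leading digit of the stated error carry no information and suggest that error assessment was not considered.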

The  flavor  of  face  validation  is  that  the 
above  steps  are  carried  out  relatively  quickly 
by  one  or  more  experts  in  the  subject-matter 
area  covered  by  the  model  (e.g.,  theater 
level  modeling,  air  defense,  logistics,  anti¬ 
submarine  warfare,  intelligence).  The  result 
of  the  face  validation  process  can  be  either 
an  endorsement  of  the  model  as  is,  sugges¬ 
tions  for  model  revision,  or  outright  model 
rejection.  It  is  desirable  that  a  model  under 
development  be  subjected  to  a  face  valida¬ 
tion  process  during  the  development  process. 
It  seems  especially  expedient  that  exposure 
to  face  validation  processes  by  the  ultimate
client-user  and  his  resident  experts  be 
conducted  at  intervals  during  the  develop¬ 
ment  process. 
CHAPTER  4


PART  D  -  SENSITIVITY  STUDY  OF  A  SIMULATION  MODEL 

by  D.  P.  Gaver 


The  sensitivity  study  of  a  model  has 
several  aspects.  In  general  such  a  study  is 
made  in  order  to  check  for  plausibly  proper 
model  response  to  different  levels  of  input 
variables  over  their  jointly  appropriate 
ranges.  It  is  often  easiest  to  check  respons¬ 
es  to  special  limiting  values,  at  which  the 
proper  response  is  known  from  very  simple, 
common-sense  considerations  or  quite  basic 
physical  laws.  By  response  is  meant  the 
quantifiable  or  classifiable  outcome  of  an 
operation  or  experiment  that  the  model  is 
supposed  to  predict  or  represent.  In  the 
context  of  military  applications  a  response 
could  be  weapon  delivery  accuracy  as  a 
function  of  the  delivering  platform’s  speed
relative  to  the  target,  maneuvering  actions,
and  protective  jamming  or  use  of  diversionary
decoys  by  the  target.

A  secondary,  but  important,  feature  of 
sensitivity  testing  is  to  assess  the  degree  of 
model  response  sensitivity  to  convenient 
modeling  assumptions  that  cannot  necessari¬ 
ly  be  checked  for  validity.  For  instance, 
weapon  dispersions  from  aimpoints  might  be 
typically  taken  to  be  independently  normal- 
or  Gaussian-distributed;  times  to  equipment 
failure  might  be  represented  as  exponentially 
distributed  random  variables;  and  arrivals  of 
messages  at  a  communication  center  as  a 
Poisson  process.  It  is  useful  to  examine 
responses  when  such  assumptions  are  re¬ 


laxed  in  plausible  ways.  Insensitivity  of 
response  when  the  strict  assumptions  are 
relaxed,  given  the  values  of,  say,  the  mean 
of  the  corresponding  distributions,  is  a 
reassuring  virtue,  for  no  simple  model  of  a
system’s  components  can  be  guaranteed  to
hold  precisely.
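
The relaxation of a convenient distributional assumption can be sketched as follows. This is a hypothetical toy response, not a model from the workshop: the probability that a piece of equipment survives a mission of fixed length. Time to failure is drawn first from an exponential distribution and then from a two-stage Erlang (gamma) distribution with the same mean. The names, mean life, mission length, and seed are assumptions made for illustration.

```python
import random


def survival_fraction(draw_life, mission_length, trials=20000, seed=7):
    """Fraction of simulated items whose life exceeds the mission length."""
    rng = random.Random(seed)
    survived = sum(1 for _ in range(trials) if draw_life(rng) > mission_length)
    return survived / trials


def compare_life_assumptions(mean_life=100.0, mission_length=50.0):
    """Evaluate the same response under two failure-time assumptions."""
    assumptions = {
        # Strict assumption: exponential life with the given mean.
        "exponential": lambda rng: rng.expovariate(1.0 / mean_life),
        # Relaxed assumption: gamma with shape 2 and scale mean/2,
        # so the mean (shape * scale) is unchanged.
        "erlang_2": lambda rng: rng.gammavariate(2.0, mean_life / 2.0),
    }
    return {
        name: survival_fraction(draw, mission_length)
        for name, draw in assumptions.items()
    }


if __name__ == "__main__":
    for name, fraction in compare_life_assumptions().items():
        print(name, round(fraction, 3))
```

For these assumed values the analytic survival probabilities are exp(-0.5), about 0.61, for the exponential case and 2 exp(-1), about 0.74, for the Erlang case, so this particular response is not insensitive to the assumption even with the mean held fixed. A study of this kind shows the analyst where better data on the distribution's form are needed.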

Sensitivity  studies  are  typically  conduct¬ 
ed  by  comparing  model  results,  i.e.,  the 
pattern  of  responses  when  input  variable 
values  are  changed,  to  comparable  results 
from  other  validated  models,  or  to  relevant
experimental  or  field  data.  The  input  of
expert  judgment  can  also  provide  valuable 
guidance. 

A  systematic  and  well-conducted  sensi¬ 
tivity  study  will  help  isolate  errors  or  omis¬ 
sions  in  the  model’s  structural  formulation 
as  well  as  in  the  enabling  computer  code.  It 
will  usefully  employ  expertise.  It  will 
identify  information  and  data  needs  and 
criticality. 

Since  many  important  models  must 
represent  one  or  more  meaningful  responses 
in  terms  of  many  input  variables  it  can  be 
anticipated  that  the  use  of  systematic  experimental
design  tools,  such  as  fractional  factorial
designs  and  response  surfaces,  should  and  does
prove  useful  for  better  understanding  a 
model’s  behavior. 
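
A minimal sketch of one such design tool follows: a two-level half-fraction design for three input variables, in which the third variable's level is set to the product of the first two. The response function is a hypothetical stand-in for a simulation run, and all names are assumptions made for this example. Four runs, rather than the eight of a full factorial, are enough to estimate the three main effects, at the cost that each main effect is aliased with a two-factor interaction.

```python
from itertools import product


def half_fraction_design():
    """2^(3-1) design: full factorial in A and B, with C = A * B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    half = len(design) / 2
    effects = []
    for j in range(len(design[0])):
        high = sum(y for row, y in zip(design, responses) if row[j] == 1)
        low = sum(y for row, y in zip(design, responses) if row[j] == -1)
        effects.append((high - low) / half)
    return effects


def toy_response(a, b, c):
    """Hypothetical stand-in for one simulation run at coded input levels."""
    return 10.0 + 3.0 * a - 2.0 * b + 0.5 * c


if __name__ == "__main__":
    design = half_fraction_design()
    responses = [toy_response(*row) for row in design]
    print(design)
    print(main_effects(design, responses))
```

Because the toy response is purely additive, the estimated effects are twice the coefficients of the response function. With a real simulation, a large estimated effect for the third variable could instead reflect an interaction between the first two, which is the usual caution when reading a fractional design.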
CHAPTER  5  -  ACCREDITATION

by  Ernest  Seglie  and  Patricia  Sanders


The  official  determination  that  a 
model  is  accepted  for  a  specific  pur¬ 
pose  is  called  ACCREDITATION. 


5.0  INTRODUCTION 

Operations  Research  applies  the 
scientific  method  to  the  analysis  of  military 
operations  and  the  utilization  of  military 
assets.  Operations  researchers  use  the 
powerful  technique  of  computer  modeling  to 
form  their  conclusions.  The  British,  who 
invented  it,  define  OR  as: 

the  attack  of  modern  science
on  complex  problems  arising 
in  the  direction  and  manage¬ 
ment  of  large  systems  of 
men,  machines,  materials  and 
money  in  industry,  business, 
government  and  defense.  Its 
distinctive  approach  is  to 
develop  a  scientific  model  of 
the  system,  incorporating 
measurements  of  factors  such 
as  chance  and  risk,  with 
which  to  predict  and  compare 
the  outcomes  of  alternative 
decisions,  strategies  or  con¬ 
trols.  The  purpose  is  to  help 
management  determine  its
policy  and  actions  scientifi¬ 
cally. 

Such  policy  and  actions  are  usually 
implemented  by  a  decision  maker,  who  may 
start  by  knowing  little  about  the  techniques 
and  methods  of  operations  research  and 
nothing  about  the  models  used  during  the 
research. 


The  researcher  must  communicate  to 
the  decision  maker  the  results  of  the  re¬ 
search  and  something  else  as  well:  the  ap¬ 
propriate  measure  of  confidence  in  the 
results.  For  the  decision  maker,  to  act  on 
the  results  of  the  model  research  is  to  give 
credence  (in  some  degree)  to  the  model. 
The  decision  maker  must  have  some  reason 
for  believing  that  the  model  is  acceptable  for 
the  purpose  to  which  it  was  put. 


This  chapter  is  written  from  the  point 
of  view  of  the  operations  researcher  who 
must  get  a  model  accredited.  For  the  opera¬ 
tions  researcher,  the  accreditation  process  is 
the  way  to  provide  what  the  decision  maker 
needs  in  order  to  give  the  appropriate  cre¬ 
dence  to  the  model  and  its  use.  The  formal 
process  of  accreditation  should  mirror  the 
processes  that  accompany  all  research:  the 
process  of  convincing  oneself  that  the  meth¬ 
ods  and  results  are  reasonable,  appropriate, 
and  worth  believing  to  some  extent,  and  of 
assessing  the  confidence  one  has  in  the 
results. 

In  simpler  times  accreditation  could 
be  done  in  ad  hoc  fashion.  The  decision 
maker  knew  the  researcher  and  the  quality 
of  the  researcher’s  work  built  on  possibly 
years  of  interaction.  The  researcher  knew 
the  decision  maker  personally  and  could  ask 
what  was  important  and  what  was  the  real 
question.  Such  a  close  partnership  charac¬ 
terizes  some,  but  not  most,  efforts  today. 
Often  there  are  several  layers,  or  filters,  of
management  between  the  researcher  and  the 
decision  maker.  As  a  result,  more  care 
must  be  taken  in  the  communication  process. 
Accreditation  is,  in  part,  a  response  to  the
gap  that  has  developed  between  the  opera¬ 
tions  researcher  and  the  decision  maker.  If 
used  properly,  the  accreditation  process  can 
reestablish  the  close  partnership  that  is 
necessary  to  do  relevant  and  credible  work.
The  aim  of  accreditation  is  the  mutual  agree¬ 
ment  by  the  researcher  and  decision  maker 
on  the  extent  to  which  the  model  can  be  the 
basis  of  decision. 

In  sum,  Quade  and  Carter  note  in  a 
discussion  of  "The  Modeler’s  Versus  the
Decision  Maker’s  View  of  Quality,"  that 
".  .  .operations  research  has  lost  its  most 
important  roles  because  it  has  devolved  from 
a  market  orientation  based  on  the  client’s 
needs  to  a  professional  orientation  based  on 
tool  development."’ 

5.1  DEFINITION  OF  ACCREDITA¬ 
TION 

Accreditation  is  the  official  determi¬ 
nation  that  a  model  is  acceptable  for  a  spe¬ 
cific  purpose.  But  what  makes  it  different 
from  Validation?  Validation  is  the  process 
of  determining  the  degree  to  which  a  model 
is  an  accurate  representation  of  the  real 
world  from  the  perspective  of  the  intended 
uses  of  the  model.  Every  model  is  a  sim¬ 
plification  of,  and  a  distortion  of,  the  real 
world.  A  determination  of  complete  validity 
is  therefore  impossible.  No  validation  is 
expected  without  many  caveats.  The  best 
caveats  (1)  specify  the  range  of  values  of 
variables  over  which  the  model  has  been 
checked  (its  field  of  validity)  and  then  (2) 
specify  the  error  that  the  model  generates 
within  the  specified  input  field  (degree  of 
validity). 
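
The two caveats can be recorded in a simple form. The sketch below is illustrative only: it assumes validation cases are available as triples of input value, model output, and observed output, and it summarizes the field of validity as the range of inputs checked and the degree of validity as the relative error found within that range. The names and the choice of relative error as the measure are assumptions, not part of the definition above.

```python
def validity_caveats(cases):
    """Summarize checked input range and relative error over validation cases.

    cases: list of (input_value, model_output, observed_output) triples,
    with nonzero observed outputs.
    """
    inputs = [x for x, _, _ in cases]
    rel_errors = [abs(model - obs) / abs(obs) for _, model, obs in cases]
    return {
        "field": (min(inputs), max(inputs)),          # field of validity
        "max_rel_error": max(rel_errors),             # degree of validity
        "mean_rel_error": sum(rel_errors) / len(rel_errors),
    }


def within_field(caveats, input_value):
    """True if an intended input lies inside the range that was checked."""
    low, high = caveats["field"]
    return low <= input_value <= high


if __name__ == "__main__":
    # Hypothetical validation cases: (input, model output, observation).
    cases = [(1.0, 10.0, 8.0), (2.0, 21.0, 20.0), (5.0, 45.0, 50.0)]
    caveats = validity_caveats(cases)
    print(caveats)
    print(within_field(caveats, 3.0), within_field(caveats, 6.0))
```

An intended use that falls outside the recorded field is not thereby invalid, but the stated error no longer applies to it, and that is the kind of limitation an accreditor needs to see.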

The  degree  of  validity  must  reflect 
the  error  in  the  output  of  the  model.  In 
order  to  determine  the  error,  data  must  be 


available  or  a  test  must  be  run  against  which 
to  compare  the  model  predictions.  The 
result  of  a  validation  comparison  might  be, 
for  example,  that  the  model  is  only  good  to 
an  order  of  magnitude,  or  that  the  model  is 
able  to  calculate  the  range  of  detection  in 
free  space  to  within  x,  if  the  Radar  Cross 
Section  (RCS)  of  the  target  is  greater  than  z 
and  known  to  y.  The  validation  should  also 
address  the  nature  of  the  error— whether  it  is 
systematic  or  random,  and  what  aspect  of 
the  model  is  causing  the  error.  Validation 
should  discuss  the  data  base  from  which  the 
model  was  derived  and  the  data  base  to 
which  the  model  outputs  were  compared  to 
determine  the  error. 

The  output  from  the  validation  may 
say  in  summary  that,  "When  employed  to 
evaluate  force-on-force  engagements  of 
battalion  size,  the  loss  exchange  ratio  of  a
single  run  may  differ  from  other  single  runs 
by  a  factor  of  three.  They  differ  from 
typical  training  exercises  by  factors  of  from 
two  to  ten  and  differ  from  the  experience  of 
actual  combat  (not  used  in  the  development 
of  the  model)  by  factors  of  two  to  a  thousand."
Validation  says  a  lot  about  the  model
from  the  perspective  of  the  intended  uses
of  the  model.  But  validity  is  separate  from 
the  specific  use  of  the  model  in  a  specific 
decision  process. 

Preparing  an  accreditation  begins  by 
understanding  how  the  model  outputs  are  to 
be  used  in  the  decision  process.  This  under¬ 
standing  is  also  the  starting  point  of  the 
operations  research  itself.  What  is  the 
question?  What  are  the  information  needs  to 
answer  the  question?  Why  do  you  need  the 
model?  What  will  the  model  produce  that  is 
important  for  the  answer  to  the  question? 
Accreditation  will  be  the  determination  that 
the  model  outputs  are  important  to  the  decision,
and  the  determination  that  the  degree
to  which  the  model  represents  the  "real 
world"  is  sufficient  for  the  purpose  to  which 
the  model  will  be  put  in  the  decision  pro¬ 
cess;  that  is,  accreditation  must  account  for 
the  specific  use. 

For  example,  in  the  case  of  exchange 
ratios  discussed  above,  if  the  approximate 
difference  in  cost  of  two  alternatives  is 
expected  to  be  30%,  this  may  imply  that 
performance  differences  of  that  order  are
important  to  know.  Note  that  this  is  a 
question  not  directly  addressed  in  any  of  the 
validity  caveats  mentioned  above. 

If  the  detection  model  above  is  used
to  explore  whether  holes  in  sensor  coverage 
appear,  such  a  determination  will  depend  on 
the  density  of  sensors  as  well  as  the  model. 
That  the  RCS  will  be  known  only  to  8  may 
mean that the model cannot be used to
explore  whether  or  not  holes  open  up  in  the 
sensor  coverage. 

Thus the accreditation process must
take the demonstrated degree of agreement
with the "real world" and assess the significance
of the known limitations to the intended
specific use at hand. It must represent
how the model is intended to be used (by
the decision maker) in the decision process,
and it must assess the degree of risk in using
the model in that way. The validation
process is only the first step in such an
assessment, and as such may miss the special
features of the specific use. As a result,
additional sensitivity analyses may be required
in order to come to any conclusion
before the official determination that a
model is acceptable for a specific purpose.

Certain  features  from  the  definition 
of  accreditation  should  be  stressed: 


It  is  official.  The  accreditation  is 
performed  by  the  decision-making  official. 
This  flows  from  the  responsibility  of  the 
decision  maker,  which  cannot  be  transferred 
to a computer, a computer model, or another
individual. The decision maker must,
ipso facto, believe in the tools used to provide
the information to make the decision.

The operations researcher’s task is to
define clearly why to believe and how much
to believe in this particular situation. The
researcher has a right to know what is important
to the decision maker. The transfer
of belief requires a relationship between the
researcher and the decision maker that is
essential to the proper functioning of operations
research. During the filtering that
goes on as a report goes from the researcher
to the decision maker, the critical assumptions
and caveats too often get lost.

It is a determination. A decision has
to be made; therefore accreditation is more
than a process. How to make that determination
should emerge from a dialogue between
the operations researcher and the
decision maker.

It  includes  a  definition  of  what  is 
acceptable.  There  must  be  criteria  for 
"good  enough"  on  which  the  researcher  and 
the  decision  maker  agree.  The  notion  of 
"good  enough"  can  be  the  subject  of  an 
analysis. It can be based on the consequences:
the risk assessment for wrong
decisions, or the cost of buying more confidence.

Accredit  with  respect  to  a  specific 
purpose.  Assessing  the  risks,  costs,  and 
consequences  requires  that  the  specific 
application be known. When the researcher
does not know the use to which the research
is put, the results of the research could
easily be misapplied.

Accreditation,  then,  is  an  official 
decision  that  a  model  appears  suitable  to 
study a specific problem or issue. Accreditation
goes beyond validation in that it
makes a judgment taking into account the
lack of "complete validation." During the
accreditation process (which precedes the
use of the model), there is clearly no decision
to accept the model’s results.

5.2  ACCREDITATION:  THE  NUTS 
AND  BOLTS 

As  Quade  and  Carter  note: 

. . .the argument for paying
considerable attention to
procedures for winning acceptance
from the client and
the staff as opposed to sole
dependence on the logic of
the analysis for that purpose
is that if the findings of a
policy analysis fail to influence
the relevant decision
makers, then that analysis, as
a piece of policy oriented
research, did not accomplish
its purpose, no matter how
good it might seem in the abstract
or to other analysts.

Immediate  acceptance 
of  all  aspects  of  an  analysis, 
however,  is  rarely  to  be 
expected;  acceptance  of  ideas 
takes  time.  To  be  listened  to 
and  carefully  considered  is  a 
practical  goal,  even  though 
not  a  completely  satisfactory 
one.^ 


To gain the acceptance we call accreditation,
we recommend that the researcher
start early, have a plan, have a
team, have a methodology, and, in the end,
have a document.

Accreditation  should  follow  careful 
verification  and  validation.  When  can  the 
operations  researcher  go  to  the  decision 
maker  expecting  to  get  accreditation  in 
writing?  When  can  model  accreditation 
reasonably  be  expected  to  be  attained?  Only 
after  the  results  are  in,  when  you  have  a 
chance  to  see  what  surprises  may  be  in 
store.  It  would  be  irresponsible  to  accredit 
a  model  before  seeing  that  the  results  make 
even  the  vaguest  sense,  or  learning  what 
aspect  of  the  model  drives  the  particular 
results  that  are  supposedly  relevant  to  the 
decision  at  hand.  Accreditation  should  not 
be  constrained  to  preset  criteria,  although 
criteria can be established as part of a specific
accreditation effort. Does this mean
that  accreditation  is  of  the  model  results? 
No.  The  results  could  turn  out  to  be  useless 
for  the  specific  decision,  even  when  those 
results  are  valid  (known  to  be  accurate  to 
within  a  specified  tolerance).  This  would  be 
the  case  if  the  results  were  driven  by  what 
turned  out  to  be  a  false  assumption,  or  if  the 
accuracy  were  not  enough  for  the  decision  at 
hand. 

The  accreditation  effort  has  to  be 
tailored to the model application. For example,
in the evaluation of a new weapon
system,  the  accreditation  effort  would  have 
to  look  closely  at  how  new  technologies  are 
treated  in  the  model,  the  impact  on  the 
model’s  decision  rules  given  new  tactics  that 
are feasible with the new system, and changes
in the existing modeled interactions between
new and existing systems. What must
be communicated to the decision maker are
the strengths and weaknesses of the application
of the model to a specific study, its
limitations with regard to that application,
and how those limitations affect the decision
maker’s risk of making a bad decision.

5.2.1  Start  Early 

Identify the specific purpose for
which the model will be used. What is the
question/decision? Keep in mind the story
of the seven-year old who asked his mother
"Where did I come from?" The mother had
expected at least a few more years before
addressing such questions, but, wanting to
encourage openness and not stifle trust, she
bravely explained all she could. After
considerable time, the young boy interrupted,
"Yes, but Bobby says he’s from Spokane..."
It is hard to give a meaningful
answer if you don’t know what the question
really is. The most direct route is to find
out from the one who must make the decision.

Identify  the  accrediting  authority. 
From  that  person  the  researcher  can  find 
what  are  the  variables  of  interest,  what  will 
be  accredited,  and  what  role  the  researcher’s 
efforts  will  have  in  the  decision  (the  latter  is 
particularly important.) With this knowledge
the researcher can focus: focus the
model building, focus the verification and
validation efforts, focus the sensitivity analyses,
and focus the research on the real
question and requirement for meaningful
information.

5.2.2  Have a Process and a Plan

The  plan  should  identify  the  issues 
and  scope  of  the  effort.  The  researcher 
should  insist  that  the  plan  be  reviewed  and 
approved in the same chain that the accreditation
will follow. The plan becomes a dry
run of the process that will be followed after
the results are in and accreditation is sought.
This is a way of establishing contact between
the researcher and the decision maker.
There are, in many areas of model
application,  places  where  an  accreditation 
plan  can  be  described  or  referenced.  For 
models  to  be  used  in  support  of  operational 
test  and  evaluation,  the  COEA  Guidance 
Memo  and  the  TEMP  are  two  appropriate 
places  to  seek  agreement  on  accreditation 
issues. 

The  accreditation  plan  should  include 
at  least  tentative  criteria  from  the  accrediting 
authority. 

5.2.3  Have  a  Team 

The  researcher’s  efforts  should  make 
it  easier  for  the  decision  maker  (and  the 
decision  maker’s  advisors)  to  accept  the 
results  with  knowledge  of  the  strengths  and 
weaknesses  of  the  model. 

One of the things that the researcher
can do is to ask for a team to review the
model. Strong teams have certain things in
common. They are made up of experts. To
ensure that they don’t have a narrow slant
on the subject, they are interdisciplinary.
While the use of computer modeling may be
new, the problem of building confidence is
not. Aristotle suggests that to create confidence
requires that the speaker appear to
possess practical intelligence, moral excellence
and good will. Interdisciplinary teams
suggest "practical intelligence." Further,
they don’t have an axe to grind, which in
bureaucratic parlance is often referred to as
"independence." (They have "good will.")

Have  someone  from  the  accreditor’s 
office  on  the  team.  Or  establish  a  dialogue. 
Make progress reports... The team may be
necessary because the model can have several
users (decision makers) at the same time.

5.2.4  Have  an  Approach  or  Method 

A model is a simplification of reality.
It is used to represent some portion of reality.
The first challenge that the researcher
should address is whether the model is
helping the decision maker to understand
better. (The worst thing that could happen
is for the researcher to take the view that the
decision maker cannot or will not take the
time to understand the results, how they
were obtained, and why they are believable.)
The purpose of the model is to improve
understanding. The "answer" may be
secondary. When this is the case, it is most
important to explain the qualitative nature of
the model output, and not to allow the
decision maker to put undue confidence in
the numerical outcome. It was the appreciation
of this aspect that led the Defense
Science Board to warn, "Do not use models
and simulations to prove things."

5.3  COMMUNICATION

The  operations  researcher  has  the 
following  responsibility  in  communicating 
the  results  of  the  model:  he  must  bring  to 
the  surface  (for  the  decision  maker  or 
accreditor  to  see)  those  aspects  of  the 
model’s  employment  and  use  that  are  most 
important  in  obtaining  the  results  presented. 
The  fact  of  the  matter  is  that  many  complex 
models  obscure  seeing  these  "drivers." 
Exposing  them  is  sometimes  a  difficult  task. 
Without  exposing  them,  the  operations 
researcher’s  task  is  not  complete.  Good 
analysis  requires  such  an  effort.  In  the  ideal 
case  the  results  of  the  model  can  be  derived 
on  the  back  of  an  envelope  (meaning  that 
the  result  can  be  derived  and  explained  to 
some  approximation  without  the  computer 
black box). If the researcher can, with the
use of such an envelope, help the decision
maker to understand what is "driving" the
results, the work of transferring confidence
to the model is very far along. The decision
maker  in  general  will  have  little  trouble 
letting  the  computer  generate  the  "next 
significant  figure."  Actually,  the  process  of 
abstracting  from  the  model  the  truly  salient 
features,  the  driving  factors,  and  the  critical 
assumptions  is  what  operations  research 
should  be  about.  If  the  real  driver  is  the 
input  that  system  x  has  an  acquisition  range 
twice  system  y,  that  should  be  explicit,  not 
hidden in the million lines of code. Exposing
why the model gives the result is critical
to  the  decision  to  use  or  avoid  use  of  the 
model. 

In  making  the  case  for  computer 
literacy, John G. Kemeny has noted:³

Unfortunately, most decision
makers in government and
industry today are computer-illiterates.
Although computer
systems are in place in
most large organizations,
they perform mostly routine
book-keeping functions and
are used little, if at all, in
decision-making. High-level
executives, too embarrassed
to expose their ignorance of
computers by asking questions
of the computer center,
often leave important corporate
decisions by default to
computer programmers, who
must fill in the gaps in the
vague, general instructions
they receive from top management.

The  operations  researcher  should 
avoid  "filling  in  the  gaps"  without  educating 
the decision maker on how good the fill is.
Accreditation is a good place to start, as has
been noted. There is no single method.
There are some models of accreditation, but
none is accredited. Each model has
strengths and weaknesses which will be
discussed. The accreditation plan should
develop from the interaction of the researcher
and the decision maker.

5.3.1  The  Legal  Analogy 

Often models are used as part of a
contentious, adversarial process. The model
is used as part of the evidence for or against
some particular point. This is
unfortunate because the researcher
who knows the work is usually not present
to explain what was actually done with the
model and what are the appropriate conclusions
from the model runs.

If  the  expectation  of  the  researcher  is 
that  such  will  be  the  case,  one  possibility  is 
to  encourage  the  debate  in  its  proper  forum, 
namely  the  research  community.  In  such  a 
case,  for  example,  a  red  team  could  be 
formed  to  find  and  document  the  weaknesses 
of  the  model  and  its  use.  This  would  be 
submitted  with  the  results  of  the  run  as  part 
of  the  accreditation. 

As  with  our  legal  system,  we  make 
the  best  case  for  and  against  the  particular 
modeling  or  simulation  application  and  pass 
judgment  on  whether  or  not  to  use  it.  Key 
to the legal analog are the following:

•  present both sides fairly.

•  have some ground rules for
what is relevant, e.g., the
MORS SIMVAL areas for
consideration.

•  do not suppress evidence;
bring diverse views to the
forum, expert witnesses,
proponents, adversaries
(the formation of Red
Teams), whatever makes
sense.

5.3.2  The  Moral  Analogy 

While the legal analog may be necessary
in particularly contentious cases, it does
imply that there are two adversaries with
positions to defend. So long as the debate
focuses on the applicability of the model to
the problem at hand, the debate can be
helpful. Once the transition is made to the
results, the analyst should be careful. "But
of all our sins, the one that will finally hurt
the profession the worst is the blurring of
‘analysis’ on the one hand and ‘position-taking’
on the other."⁴ Protecting the profession
is a noble goal; the analyst should
not be a hired gun: "Have Model Will
Travel." Self interest should also have a
role. If the profession becomes litigious
beyond reasonable limits, we will begin to
share other attributes with lawyers. Decision
makers may come to feel as
Shakespeare’s Cade felt in Henry VI, Part
II, "The first thing we do, let’s kill all the
lawyers."

Establishing trust, rather than inviting
confrontation, is not a new problem.
Morse  and  Kimball  note: 

The reaching of a
working understanding on
"terms of reference" between
the operations research worker
and the administrative
head to whom he is assigned
is one of the most important
organizational problems
encountered in entering a
new field of operations research.
Scientist and administrator
perform different
functions and often must take
opposite points of view. The
scientist must always be
skeptical, and is often impatient
at arbitrary decisions;
the administrator must eventually
make arbitrary decisions,
and is often impatient
at skepticism. It takes a
great deal of understanding
and mutual trust for the two
to work closely enough together
to realize to the fullest
the immense potentialities of
the partnership. (Italics
added.)⁵

5.3.3  Test-Model-Test-Model 

"The  basic  cycle  of  the  scientific 
method  may  be  divided  into  three  steps: 
induction, deduction, and verification.
...Induction is the step which carries the
scientist from factual observation to the
formation of theories. Once the theory is
formulated precisely, the tools of logic and
mathematics are available to deduce consequences
from it. Once a number of interesting
consequences have been deduced from
the theory, they must be put to the test of
experimental verification..."⁶

Test results can be used to either
further accredit or discredit a model (and by
implication its results) by uncovering interactions
and factors not foreseen in the modeling
effort. One of the greatest dangers in
model building is to ignore those factors that
are difficult to model or for which there is
scant data on which to base a model. Comparison
with field results of any kind often
provides the rude awakening that allows
greater objectivity as to the limits of the
model. There can be a certain tension
between modelers and testers that can be of
benefit to both. There is an old adage that,
"No one believes a model, except the person
who wrote it, and everyone believes a test
result except the person who ran it." The
modeler will look at the test result as one
realization of possible outcomes and discount
any discrepancy. This should not be
encouraged. The model
should produce estimates of both the expected
outcome and the variability about the
expected value. Any discrepancy should be
examined to see if it is due to factors neglected
in the model that contribute to variability,
or to a sample size that was too
small. This said, it should be clear why it
is wise for the tester to choose the sample
size with a knowledge of what the model
predicts for the variability. Models that
cannot be disproved based on test results are
of a utility similar to tests in which it is
impossible to fail: neither is worth considering.
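
The comparison between a test average and the model's prediction can be sketched in a few lines of Python. This is an illustration only, not a procedure from the workshop: the function names, the normal approximation, and the threshold of two standard errors are assumptions introduced here.

```python
import math


def discrepancy_z(model_mean, model_sd, test_mean, n_trials):
    """Gap between the test average and the model's expected value,
    measured in standard errors implied by the model's own variability.
    (Hypothetical helper; assumes independent trials.)"""
    if n_trials < 1 or model_sd <= 0:
        raise ValueError("need n_trials >= 1 and model_sd > 0")
    standard_error = model_sd / math.sqrt(n_trials)
    return (test_mean - model_mean) / standard_error


def needs_investigation(z, threshold=2.0):
    """Flag a discrepancy too large to discount as one unlucky realization.
    The threshold is an assumed convention, to be agreed with the tester."""
    return abs(z) > threshold


# Model predicts 10 with standard deviation 4; 16 trials averaged 13.
z = discrepancy_z(model_mean=10.0, model_sd=4.0, test_mean=13.0, n_trials=16)
print(z, needs_investigation(z))
```

A flagged discrepancy does not say which side is wrong. It only says that the modeler can no longer dismiss the test as a single draw, and that neglected sources of variability or an undersized sample should be examined.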

5.3.4  Back-of-the-Envelope  Believability 

The purpose of a model is not to
duplicate reality; the purpose is to increase
our understanding of certain factors that are
important in a problem. When a model
becomes too complicated to explain, and the
origin of results is obscured inside a black
box, then the model has not increased our
understanding, it has obscured our ignorance.
(Often this is evident in an exchange
that goes: "Why did it turn out that way?"
"That’s just the way it turns out!") The
model should allow traceability of cause and
effect between the variables of interest and
the outcomes.

One technique or discipline that can
be used is to strip the model down to essentials,
find out what are the key assumptions
that drive the model, and develop a simpler
model that is easier to understand and explain.
In fact, good research might begin
with the simplest models that are back-of-the-envelope.
More complicated models
develop as more factors that are potentially
relevant are treated more explicitly. These
additional factors may change detailed numerical
outputs but should not change
overall trends or conclusions, provided the
original model was good. Whichever way
the chronology of the research occurs, the
result can be a hierarchy of explanations that
go into greater and greater detail, until the
results are understood to a point where the
decision maker can understand why and how
the results come about.

Often the operations researcher is not
the developer of the model, but is expected
to employ a model and use it to get an
answer. This is a very dangerous situation.
The researcher must first determine, in the
researcher’s own mind, the acceptability of
the model for the specific purpose. The
technique described above works particularly
well in such a situation.

5.3.5  Risk Assessment as Part of Accreditation

The level of effort applied to accreditation
is driven by the perceived importance
of the use of the model. Accreditation
should  demand  that  an  analysis  of  potential 
"unknowns"  be  done  and  documented. 

One standard technique that the
operations researcher has is Decision Theory.
In such an approach the analyst will
evaluate the consequences of making a
mistake by using the model. The risk will
depend on the model and the decision to be
made. It may depend on the phase of the
program.


Table 5-1. Example of Risk Variance in Use of Models in the Acquisition Process

Phase                    Area of Impact          Impact
-----------------------  ----------------------  -----------------------------
Mission area analysis    Further studies
COEA                     Choice of alternatives  Keep one or more alternatives
System design            Engineering analysis    Choose to run tests
Test planning            Sample size             Cost of test
Test execution           Shape battle            Test realism
Evaluation               Pk analysis             Exit criteria
Milestone III            Procurement             Fielding

Risk assessment could work as follows:

•  First, together with the decision
maker, determine the
purpose to which the model
will be put, and the decision
actually to be made.

•  Then develop a model (a
meta-model?) of how the
computer model will be used
in the decision process. (For
example, it might be to confirm
that no previously used
models contradict what the
decision maker wants to do,
or it might be to generate a
single parameter estimate that
is a go/no-go criterion.)

•  Assess the risk (expected
loss) in the use of the computer
model by determining
(1)  the  probability  of  a  wrong 
answer  from  the  model  (for 
example  due  to  the  error 
bounds  of  the  inputs),  (2)  the 
effect  of  a  wrong  answer  on 
the  decision,  and  (3)  the 
ultimate  consequences  a 
wrong  answer  might  have  on 
the program. Some decisions,
like investment decisions,
cannot be avoided, but
the program can be corrected
if later events do not confirm
the expected outcome. Other
decisions might not be
checkable at all, or only
after huge resources have
been wasted. In examining
this aspect of the question,
the adequacy of the decision
maker’s plan for the program
needs to be understood. In
fact, the plan for the program
should have been constructed
with internal checks so that
wrong decisions can be found
and corrected without significant
loss. In assessing the
risk at a particular decision
point, the analyst may be
able to use the planner’s
analysis if it is available.
During defense program
execution,  there  are  distinct 
phases  during  which  the 
consequences  of  a  wrong 
decision  are  very  different 
(see  table). 
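
The three-part assessment in the last step can be sketched as an expected-loss calculation. The Python below is a minimal illustration under stated assumptions: the class and field names are hypothetical, the three factors are treated as independent and simply multiplied, and the numbers are invented for the example rather than taken from any program.

```python
from dataclasses import dataclass


@dataclass
class DecisionPoint:
    """One use of the model at one phase (all fields are illustrative)."""
    phase: str
    p_wrong_answer: float    # (1) probability the model gives a wrong answer
    p_decision_flips: float  # (2) chance a wrong answer changes the decision
    loss_if_wrong: float     # (3) ultimate cost to the program, any unit


def expected_loss(dp):
    """Risk as expected loss: the product of the three factors above."""
    return dp.p_wrong_answer * dp.p_decision_flips * dp.loss_if_wrong


# Invented numbers: the same model error rate carries very different
# risk depending on the phase in which the model is used.
points = [
    DecisionPoint("Test planning", 0.2, 0.5, 10.0),
    DecisionPoint("Milestone III", 0.2, 0.5, 1000.0),
]
for dp in points:
    print(dp.phase, expected_loss(dp))
```

The point of the sketch is the comparison across phases, not the absolute values: the researcher and decision maker would have to agree on each input.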

For example, the accreditation process
may determine that the model does
better in determining relative differences
than absolute values. The model suggests
that one alternative is preferred. If the
uncertainty in the absolute results of the
model is so great that neither alternative
considered may satisfy the need, the decision
maker should know that the mission
need may go unfulfilled with some probability
if a single preferred alternative is chosen,
and a different probability if two alternatives
are kept under development. The cost of
keeping one or two alternatives under development
must also be considered. The cost
of an extra alternative through demonstration
and validation may be small compared to
incorrectly choosing a preferred alternative
and not finding out until all the development
money is spent. In other words, knowing
how much to believe the results of the model
may allow the decision maker to hedge
the risks. It may encourage exploration of
alternatives with less risk. It may encourage
a program modification to gather the kind of
data so that a better model could be developed.
It may encourage the decision maker
to  insert  test  points  into  the  program  in 
order  to  gain  more  confidence  that  the 
program  will  eventually  pay  off. 
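
The hedging argument above can be put in rough numerical form. The following Python sketch is an assumption-laden illustration: it supposes that each alternative fails to meet the need with the same probability, that failures are independent, and that costs are expressed in a common unit. None of these assumptions or numbers comes from the workshop.

```python
def expected_cost(n_alternatives, p_fail_each, carry_cost_each, loss_if_unmet):
    """Expected total cost of carrying n alternatives through
    demonstration and validation (hypothetical helper).

    Assumes independent failures, so the need goes unmet only if
    every alternative carried fails.
    """
    p_unmet = p_fail_each ** n_alternatives
    return n_alternatives * carry_cost_each + p_unmet * loss_if_unmet


# Invented numbers: each alternative has a 30% chance of falling short,
# costs 10 units to carry, and an unmet mission need costs 200 units.
one = expected_cost(1, 0.3, 10.0, 200.0)
two = expected_cost(2, 0.3, 10.0, 200.0)
print(one, two)
```

With these invented inputs the second alternative more than pays for itself. With a smaller failure probability or a higher carrying cost the conclusion reverses, which is why the decision maker needs to know how much to believe the model.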

As a second example consider that
using a model to help plan tests can be very
useful, and not incur great risk. For example,
a model can be used to estimate the
variability of test outcomes in order to help
determine an appropriate sample size. If the
estimated variability is wrong, the confidence
level of the test may be changed, and
if the variability is very wrong, the test may
have to be extended. But the test results
will  still  be  available.  The  knowledge  that 
there  were  sources  of  variability  that  were 
not  accounted  for  in  the  model  will  probably 
stimulate  an  improvement  to  the  model,  and 
warn  the  decision  maker  about  the  model. 
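
A minimal sketch of this use of a model in test planning follows, in Python. It assumes the test estimates a mean, that outcomes are roughly normal, and that the tester wants a confidence interval of a stated half-width; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(model_sd, half_width, confidence=0.95):
    """Trials needed so the confidence interval on the mean has the
    requested half-width, using the model's predicted standard deviation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * model_sd / half_width) ** 2)


def achieved_confidence(true_sd, half_width, n):
    """Confidence actually obtained for that half-width if the real
    variability differs from what the model predicted."""
    z = half_width * math.sqrt(n) / true_sd
    return 2.0 * NormalDist().cdf(z) - 1.0


n = sample_size(model_sd=2.0, half_width=1.0)   # plan with the model's estimate
print(n)
print(achieved_confidence(true_sd=2.0, half_width=1.0, n=n))  # model was right
print(achieved_confidence(true_sd=4.0, half_width=1.0, n=n))  # variability doubled
```

If the real variability turns out to be twice the model's estimate, the planned test delivers much less confidence than intended, which is the signal to extend the test and to look for what the model left out.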

5.3.6  Documentation 

The  emphasis  on  accreditation  today 
means  that  the  operations  researcher  should 
document  the  evidence,  the  review  process, 
and the thought process. The table below
suggests what the documentation should include.


Table 5-2. Considerations for Accreditation Documentation

Evidence   What is the evidence that the model is appropriate for
           the problem at hand?
           Are the input data to the model relevant? What is the
           data base on which the model is built?
           Is this data base relevant?

Criteria   What criteria were used to decide on the appropriateness
           of the model?

Process    Who is the decision maker or accreditor?

Decision   What does the decision maker need in order to understand
           the purpose of this accreditation?

Caveats    What warnings need to be clearly stated?

5.4  SUMMARY

Reduced  to  its  essentials  the  VV&A 
problem  for  the  operations  researcher  is  the 
following: 

•  All  models  are  wrong  (at 
some  level  and  in  some  way). 

•  Validation determines how
the model is wrong and when
(i.e., it determines the limits
and errors in the model).

•  Accreditation is the determination
that the decision to be
made is not sensitive to those
errors and limitations.


The  approach  to  accreditation  outlined  here 
is  not  institutionalized  in  directives  or  plans. 
What  is  suggested  is  that,  as  a  matter  of 
professional practice, an operations researcher
with a modeling problem should
actively seek to install an accreditation
framework  within  the  project.  This  will 
allow  the  development  of  a  focused  and 
mutually  beneficial  interaction  between  the 
researcher  and  the  decision  maker.  The  role 
of  any  model  is  to  increase  understanding 
and  facilitate  communications.  The  model 
is a tool of, not a substitute for, good
judgment.


Endnotes 

1  E. S. Quade, Analysis for Public Decisions, Third Edition, revised by Grace M. Carter, 1989,
North-Holland, New York, p. 169.

2  Ibid.,  p.  332. 

3  John  G.  Kemeny,  "The  Case  for  Computer  Literacy,"  Daedalus,  Spring  1983,  p.  218. 

4  Glenn  Kent,  "The  Role  of  Analysis  in  Decision  Making,"  Keynote  speech  before  the  24th  meeting  of  the 
Military  Operations  Research  Society,  1969. 

5  Methods  of  Operations  Research,  Philip  M.  Morse  and  George  E.  Kimball  1946,  Operations  Evaluation 
Group,  Office  of  the  Chief  of  Naval  Operations,  Washington,  D.C.,  pp.  8-9. 

6  Mathematical Models in the Social Sciences, John G. Kemeny and J. Laurie Snell, 1962, Ginn and Co.,
Boston, p. 3.




CHAPTER  VI  -  A  FRAMEWORK  FOR  VERIFICATION, 
VALIDATION,  AND  ACCREDITATION' 

Paul  K.  Davis 


6.0.  PREFACE 

This  study  was  developed  for  the 
Defense  Modeling  and  Simulation  Office 
(DMSO),  which  is  under  the  Director, 
Defense  Research  and  Engineering.  The 
study  reflects  discussions  of  the  DMSO’s 
Applications  and  Methodology  Working 
Group, which I chaired during this work.
The  study  also  draws  upon  discussions  at 
two special meetings on verification, validation,
and accreditation (VV&A) sponsored
by the Military Operations Research Society
(MORS) on October 15-18, 1990 and March
31 - April 2, 1992. Nonetheless, the material
presented  here  is  my  responsibility  and  I 
make  no  claims  about  consensus  in  the 
community.  VV&A  is  a  difficult  subject  on 
which  there  is  a  broad  range  of  opinions  and 
practices  (e.g.,  VV&A  of  software  used  in 
space  probes  is  different  from  VV&A  of 
military  simulations  used  for  analysis).  At 
the  same  time,  it  appears  that  a  considerable 
convergence  of  view  is  taking  place  and  I 
hope that this study will accelerate that
process.  Comments  and  suggestions  are 
therefore  especially  welcome.  They  can  be 
sent  by  electronic  mail  to 
Paul_Davis@rand.org through the Internet.

Work  on  this  effort  was  accom¬ 
plished  in  the  Applied  Science  and  Technol¬ 
ogy  program  of  RAND’s  National  Defense 
Research  Institute,  a  federally  funded  re¬ 
search  and  development  center  sponsored  by 
the  Office  of  the  Secretary  of  Defense  and 
the  Joint  Staff. 


6.1.  INTRODUCTION 
6.1.1  Objectives 

Verification,  validation,  and  accredi¬ 
tation  (VV&A)  is  a  complex  subject  that  has 
troubled  developers  and  users  of  models  for 
many  years.  Each  generation  of  modelers 
and analysts must think it through, because
understanding  the  issues  is  important  to 
professionalism.  Consumers  of  analyses 
exploiting models must also understand the
subject or they will have difficulty judging
the quality of products. Further, they may
either  be  insufficiently  demanding  or  sup¬ 
portive  of  VV&A  efforts  on  one  extreme,  or 
unreasonable  on  the  other— requiring  a 
degree  of  validation  that  is  impossible  even 
in  principle.  Managers  of  analysis  organiza¬ 
tions  should  understand  VV&A  so  that  they 
can  put  into  place  appropriate  procedures, 
standards,  and  incentives.  This  may  be 
called a VV&A “regime” to emphasize that
VV&A  is  not  a  one-time  event,  but  rather 
an  ongoing  but  episodic  organizational 
activity  that  should  be  understood  and  con¬ 
sidered  important  by  all  participants. 

What,  then,  might  a  VV&A  regime 
look  like  if  one  saw  it?  What  advice  should 
be  given  to  a  new  manager  who  is  ready  and 
willing  to  institute  reforms  to  establish 
sound  VV&A  policies  and  procedures?  This 
study  is  an  effort  to  sketch  the  essential 
features  of  an  answer.  Its  principal  objec¬ 
tive  is  to  provide  guidance  that  would  be 
useful  to  such  a  manager  in  government, 
industry,  or  the  academic  world.  Auxiliary 
objectives  include  discussing  the  special 
VV&A problems associated with knowledge-based
models and recommending new attitudes
about model development and VV&A
that reflect implications of modern
technology.

6.1.2  Background 

There  is  a  considerable  literature  on 
VV&A for military models, much of it
severely  critical  of  model  developers  and 
their  government  sponsors  for  there  not 
having  been  enough  VV&A  in  the  past.^ 
There  is  no  definitive  source  on  what 
VV&A  is  or  should  be,  but  someone  new  to 
the  field  might  well  consult  Thomas  (1983), 
other  chapters  of  Hughes  (1989),  Gass 
(1982),  Sargent  (1987),  and  Martin 
Marietta (1990).^ The first of these has a
philosophical  slant  and  addresses  some  of 
the  profound  difficulties  in  even  contemplat¬ 
ing  model  evaluation.  The  latter,  which 
draws  on  the  work  of  Gass,  Sargent,  and 
others,  describes  an  approach  that  has  been 
used  in  large-scale  efforts  having  to  pass 
rather  stringent  DoD  criteria.  Another  good 
introduction  to  validation  issues  is  Miser  and 
Quade  (1988).  Finally,  those  concerned 
with  VV&A  will  surely  want  to  examine 
guideline  documents  emerging  from  sponsor¬ 
ing  organizations,  as  well  as  regulatory 
documents  such  as  U.S.  Army  (1992)  (espe¬ 
cially  Chapter  6  on  VV&A)  and  DoD-MIL- 
STD  2167,  which  describes  software  stan¬ 
dards. 

In  this  study  I  present  some  definitions 
(Section  2)  along  with  discussion  of  what  the 
definitions  mean  and  why  they  are  not  sim¬ 
pler.  My  definitions  of  validation  and 
accreditation  extend  the  more  usual  ones  in 
important  ways.  Section  2  then  presents  a 
taxonomy  of  VV&A  methods,  focusing 
primarily  on  validation.  Section  3  describes 
VV&A  as  a  dynamic  process  that  should 
conduct  evaluations  for  both  broad  classes  of 


model  application  and  for  specific  studies 
having  detailed  analytic  plans.  Section  4 
then  pulls  things  together  and  recommends 
an  approach  for  the  use  of  practitioners, 
managers,  and  consumers  of  model-based 
analysis. 

6.2.  DEFINITIONS  AND  CONCEPTS 
6.2.1  Models  and  Programs 

“Models”  are  representations  of 
certain  aspects  of  reality  (e.g.,  of  certain 
aspects  of  particular  systems).  They  come 
in  many  forms,  including  the  physical  scale 
models  used  by  architects,  analytical  models 
expressed  in  paper-and-pencil  equations,  and 
computer  models  (see  also  the  overview 
chapter  of  Hughes,  1989).  This  study  fo¬ 
cuses  on  computerized  models,  primarily 
“simulation  models,”  which  attempt  to 
describe  how  a  system  changes  (behaves) 
over  time.*  I  am  also  concerned  here  with 
models  having  phenomenological  content 
relating  causes  and  effects  rather  than,  say, 
regression  “models”  or  optimizing  algo¬ 
rithms  that  some  might  call  models. 

Although  the  terms  “model,”  “simu¬ 
lation,”  and  “program”  are  often  used  inter¬ 
changeably,  here  and  elsewhere,  it  is  some¬ 
times  important  to  make  distinctions,  espe¬ 
cially  between  the  model  (or  what  some  call 
the  conceptual  model)  and  the  program  (or 
computer code), which implements the
model.  Annex  A  elaborates  on  this  and  ar¬ 
gues,  reluctantly  and  in  contradiction  with 
the  advice  given  by  most  scholars,  that  it  is 
becoming increasingly difficult, and decreasingly
appropriate, to separate the
processes  of  designing  and  evaluating  mod¬ 
els  on  the  one  hand,  and  designing,  build¬ 
ing,  and  evaluating  program  implementa¬ 
tions  on  the  other.  Technological  change 
demands a new approach here.

6.2.2  Models, Data, and Knowledge Bases

Throughout this study “model”
means the union of a “bare model” (also
referred to as “the model itself”) and its
“data base.” Thus, $Y(t) = Y(0) - \frac{1}{2} g t^2$ is
a bare model, while {g = 32 ft/sec^2, Y(0)
= 10,000 ft} is a data base. In some instances,
the data is a “knowledge base” in
the form of rules and algorithms.

In the past, bare models were conceptually
distinct from data in most cases.
The bare models defined structure and algorithms,
while the data base provided values
(e.g., for the gravitational constant or the
number of tanks in a division). Modern
practice, however, has muddied the distinctions.
In many models, much of the substantive
content is defined in the data base
because with most computer models it is
easier and faster to change data than the
program itself and developers have sought to
provide users as much flexibility as possible.^
As a result, the VV&A process must
consider both bare models and data bases. ®

Quite often, bare models and data
bases need to be reviewed together, in the
context of an application; in other cases
(i.e., with different model designs), they can
to greater or lesser degree be reviewed separately.
For example, one can conduct
VV&A on an order-of-battle data base
without knowing precisely how that data
base will be used. Similarly, one can conduct
VV&A on an algorithm without knowing
the precise context in which it will be
used.

6.2.3  Verification

Verification  is  the  process  of  deter¬ 
mining  that  a  model  implementation 
(i.e.,  a  program)  accurately  repre¬ 
sents  the  developer's  conceptual  de¬ 
scription  and  specifications. 


This  is  the  definition  commonly  ac¬ 
cepted  in  the  military  modeling  community. 
There  continues,  however,  to  be  some  con¬ 
fusion  and  disagreement  about  precisely 
what  is  and  is  not  covered  under  verifica¬ 
tion,  and  about  what  taxonomy  describes 
verification  activities.  I  consider  verifica¬ 
tion  to  consist  of  two  basic  parts. 

•  Logical  and  mathematical  verifica¬ 
tion  ensures  that  the  basic  algorithms 
and  rules  are  as  intended  by  the 
designer  and  do  not  include  logical 
or mathematical errors (e.g., divisions
by zero, incompletely specified
logic,  or  nonsense  results  when 
certain  variables  take  extreme  or 
unusual  values).  Although  verifica¬ 
tion  is  nominally  concerned  with 
implementation  rather  than  correct¬ 
ness  of  design,  it  is  common  for 
verification  activities  to  uncover 
design  errors  along  the  way  (e.g.,  to 
detect  an  implicit  and  unreasonable 
assumption  about  independence  of 
events).  Thus,  verification  activities 
should  begin  with  documentation  and 
will  often  accomplish  some  valida¬ 
tion  functions. 

•  Program verification (or code verification)
ensures that these representations
have been correctly implemented
in the computer program. Program
verification is concerned in
part with simple matters such as
discovering  and  correcting  typo¬ 
graphical  errors,  errors  in  the  units 
in  which  physical  quantities  are 
described,  and  errors  of  definition 
(e.g.,  a  model  designer  might  have 
intended  that  a  force  ratio  apply  only 
to  forces  on  the  forward  line  of 
troops (FLOT), but the programmer
might  have  defined  it  to  apply  to 
groupings  that  include  corps-level  re¬ 
serves).  It  is  also  concerned  with 
more  complex  issues  such  as  the 
appropriateness  of  numerical  integra¬ 
tion  techniques,^  covering  all  the 
logical  cases  (including  cases  that  the 
designer  might  consider  unlikely  or 
unphysical),  and  eliminating  bugs 
that  would  cause  the  program  to 
“crash"  in  some  circumstances. 
Many  such  bugs  involve  intricacies 
of the particular computer hardware,
operating  system,  and  interface  soft¬ 
ware. 

Verification  is  a  matter  of  degree  for 
complex  models,  because  it  is  impossible  in 
practice  to  test  the  model  over  the  entire 
range  of  variable  values  and  because  it  is 
often not feasible with available resources to
do a line-by-line code check. Thus, a model
may be well verified within a particular
“scenario  space,"  but  not  well  verified  oth¬ 
erwise.'  In  principle,  one  might  think  of 
using  sampling  techniques  to  verify  code  to 
some  level  of  confidence,  but  I  am  personal¬ 
ly  unaware  of  any  rigorous  efforts  to  do  so 
in  the  realm  of  combat  modeling. 
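
As a minimal sketch of the sampling idea, and assuming the falling-body bare model of Section 6.2.2 stands in for a real model, the following Python compares a hypothetical program implementation (a step-by-step integrator) against the designer's closed-form specification at randomly sampled points of a bounded scenario space. The function names, ranges, and tolerance are illustrative assumptions and do not come from any actual combat model.

```python
import random

G_FT_PER_SEC2 = 32.0  # gravitational constant from the Section 6.2.2 data base


def specified_height(y0, t, g=G_FT_PER_SEC2):
    """Designer's specification: Y(t) = Y(0) - 1/2 g t^2."""
    return y0 - 0.5 * g * t * t


def implemented_height(y0, t, g=G_FT_PER_SEC2, steps=1000):
    """Hypothetical program under test: integrates the fall step by step."""
    dt = t / steps
    y, v = y0, 0.0
    for _ in range(steps):
        y -= v * dt + 0.5 * g * dt * dt  # exact update for constant acceleration
        v += g * dt
    return y


def sample_verification(n_samples, y0_range, t_range, tol_ft, seed=0):
    """Return the fraction of sampled scenarios where code and spec agree."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_samples):
        y0 = rng.uniform(*y0_range)
        t = rng.uniform(*t_range)
        if abs(implemented_height(y0, t) - specified_height(y0, t)) <= tol_ft:
            passed += 1
    return passed / n_samples


if __name__ == "__main__":
    frac = sample_verification(200, (1_000.0, 10_000.0), (0.0, 20.0), tol_ft=0.01)
    print(f"fraction of sampled scenarios verified: {frac:.2f}")
```

A high pass rate supports confidence only inside the sampled ranges; it says nothing about initial heights or times outside them, which is the sense in which a model can be well verified within one scenario space and not otherwise.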

Verification  of  data  (especially 
classical  types  of  data  such  as  physical  con¬ 
stants  or  orders  of  battle  rather  than,  say, 
data  defining  elements  of  model  structure  or 
exponents  in  algorithms)  should  often  be 


distinguished  from  verification  of  the  bare 
model,  because  different  techniques  are  in¬ 
volved  and  data  bases  change  frequently.'’ 
There  are  at  least  two  aspects  of  data  verifi¬ 
cation.  The  first  aspect  involves  ensuring 
that  source  data  are  converted  properly  to 
model  input  data  and  are  consistent  with  the 
model  concept  and  logical  design  (e.g.,  that 
data  supposed  to  represent  conditional  prob¬ 
abilities  of  kill  given  a  hit  do  indeed  repre¬ 
sent  those  rather  than,  say,  kill  probabilities 
per  shot).  It  should  also  include  spot  checks 
to  confirm  that  data  were,  in  fact,  extracted 
from  the  stated  source  and  that  it  represents 
the  latest  available  from  that  source.  If  data 
is  not  provided  with  the  model,  then  verifi¬ 
cation  should  include  establishing  that  the 
required  user  inputs  are  readily  available. 
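
A short worked example, using purely hypothetical numbers, shows why the distinction between the two kinds of kill probability matters. The per-shot kill probability is the product of the hit probability and the conditional kill probability:

$$
P(\text{kill per shot}) = P(\text{hit}) \, P(\text{kill} \mid \text{hit})
$$

If, say, $P(\text{hit}) = 0.5$ and $P(\text{kill} \mid \text{hit}) = 0.8$, then $P(\text{kill per shot}) = 0.5 \times 0.8 = 0.4$. Loading the value 0.4 into a field that the model treats as $P(\text{kill} \mid \text{hit})$ would halve the modeled lethality (the model would compute $0.5 \times 0.4 = 0.2$), while loading 0.8 into a field treated as a per-shot probability would double it.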

A  different  aspect  of  data  verification 
applies  within  the  context  of  a  study  if  the 
data  base  has  already  been  installed.  Here 
one  seeks  to  establish  whether  the  data  base 
represents  correctly  the  assumptions  intend¬ 
ed  for  the  analysis.  For  example,  if  an 
analyst  states  that  he  wants  to  use  a  particu¬ 
lar  official  data  base  for  orders  of  battle, 
data-base  verification  would  include  check¬ 
ing  that  the  desired  data  base  was  the  start¬ 
ing  point  for  the  installed  data  base,  but  it 
would  also  check  to  see  if  appropriate  cor¬ 
rections  had  been  made— corrections  that  the 
analyst would surely want if only he knew
to  ask  for  them.  These  would  include  pro¬ 
viding  realistic  data  values  where  the  origi¬ 
nal  data  base  had  zeros,  blanks,  or  values 
annotated  as  “purely  nominal."  Official 
data  bases  are  often  riddled  with  holes  and 
errors.  Managers  of  analysis  and  recipients 
of  analysis  are  often  unaware  of  how  serious 
these  holes  and  errors  are,  or  of  how  much 
the  analysis  depends  on  the  cleaning-up 
process,  which  often  requires  substantive 
work and numerous subjective judgments
(which unavoidably mixes verification and
validation  activities).'® 
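
The screening step described above can be sketched in Python. This is only an illustration under stated assumptions: the record layout (a mapping from unit to fields, each field holding a value and an annotation), the function name, and the sample entries are all hypothetical. The sketch only flags entries for review; supplying realistic replacement values remains the substantive, judgment-laden work the text describes.

```python
def flag_suspect_entries(records, nominal_marker="purely nominal"):
    """Return (unit, field, reason) for each value that needs an analyst's review."""
    flagged = []
    for unit, fields in records.items():
        for field, (value, note) in fields.items():
            if value is None or value == "":
                flagged.append((unit, field, "blank"))
            elif value == 0:
                flagged.append((unit, field, "zero"))
            elif note and nominal_marker in note.lower():
                flagged.append((unit, field, "nominal"))
    return flagged


if __name__ == "__main__":
    # Hypothetical order-of-battle extract: (value, annotation) per field.
    order_of_battle = {
        "1st Division": {
            "tanks": (0, ""),
            "artillery": (72, "Purely nominal"),
            "apcs": (None, ""),
        },
        "2nd Division": {"tanks": (310, "")},
    }
    for unit, field, reason in flag_suspect_entries(order_of_battle):
        print(f"{unit}: {field} flagged as {reason}")
```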

6.2.4.  Validation 


Validation  is  the  process  of  deter¬ 
mining:  (a)  the  manner  in  which  and 
degree  to  which  a  model  (and  its 
data)  is  an  accurate  representation  of 
the  real  world  from  the  perspective 
of  the  intended  uses  of  the  model 
and  (b)  the  subjective  confidence 
that  should  be  placed  on  this  assess¬ 
ment. 


This  definition  extends  the  more 
conventional  definition."  The  extension 
calls  attention  to  two  considerations.  First, 
there  are  different  meanings  to  “accurate 
representation.”  Second,  the  validation 
process should address the issue of confidence
(not in the sense of “statistical confidence,”
but in the larger sense having to do
with  how  much  one  would  bet  on  the  cor¬ 
rectness  of  the  model’s  predictions  given 
residual  uncertainties).  While  one  could 
consider  both  considerations  to  be  implicit  in 
the  more  usual  definition,  it  seems  to  me 
evident  from  experience  that  they  will  be 
underappreciated  unless  made  explicit. 

Types  of  Validity 

To elaborate on the definition given
above  for  “validation,”  I  use  the  phrase 
“manner  in  which”  because  a  model  can  be 
“valid”  in  several  distinct  ways.  It  may 
have  (a)  descriptive  validity,  (b)  structural 
validity,  and/or  (c)  predictive  validity  (see 
also  Zeigler,  1984).'^ 

Descriptive  validity  means  here  that 
the  model  is  able  to  explain  phenomena  or 
organize  information  meaningfully  in  one 


way  or  another.  For  example,  a  descriptive 
model  might  be  able  to  say,  “Well,  the 
reason  this  happened  is  that  A  collided  with 
B,  which  happened  because  A  had  lost  its 
radar  and  therefore  failed  to  see  B  in  the 
cloud  bank.”  All  of  this  might  be  a  sound 
and  nontrivial  reconstruction  of  events. 
Note  that  the  model  used  for  such  a  recon¬ 
struction  might  not  have  been  able  to  predict 
the  events  ahead  of  time,  especially  if  the 
key  causative  events  were  stochastic  or  some 
key  inputs  such  as  precise  speed  histories 
were  unknown.  What  constitutes  a  “good” 
description  or  explanation  depends  on  con¬ 
text  and  taste. 

Structural  validity  means  that  the 
model  has  the  appropriate  entities  (objects), 
attributes  (variables),  and  processes  so  that 
it  corresponds  in  that  sense  to  the  real  world 
(verisimilitude),  at  least  as  viewed  at  a 
particular  level  of  resolution.'^  One  may 
also  require,  for  structural  validity,  that  the 
principal  algorithms  are  at  least  roughly 
appropriate,  although  not  necessarily  accu¬ 
rate  (e.g.,  whether  a  process  describes 
exponential  or  linear  growth  may  be  regard¬ 
ed  as  a  structural  issue). 

Predictive  validity  means  that  a 
model  (including  available  or  potentially 
available  data)  can  predict  desired  features 
of  system  behavior,  at  least  for  particular 
domains  of  the  initial  conditions  and  dura¬ 
tions  of  time,  to  within  some  known  level  of 
accuracy  and  precision.  A  conditionally 
predictive  model  explicitly  identifies  alterna¬ 
tive  behaviors  and  the  conditions  that  would 
cause them (e.g., “If the weather tomorrow
remains clear, then the air operation should
go well and...”).

These types of validity can be considered
more or less orthogonal attributes of a
model. As suggested in Table VI-1, one can
have models with every combination of
validity types. This is significant to VV&A,
because the criteria one applies depend
strongly on the type of validity sought.


Table VI-1  Models With Different Combinations of Validity Type

Case  Descriptive  Structural  Predictive  Example
      Validity     Validity    Validity
----  -----------  ----------  ----------  -------------------------------------
1     Yes          Yes         Yes         Well-tested weapons-performance
                                           models.

2     Yes          Yes         No          Good theater-level models (which
                                           may, however, be conditionally
                                           predictive for some features of a
                                           campaign, at least in certain
                                           domains such as when one side has
                                           overwhelming force).

3     Yes          No          No          Historically based statistical models
                                           correlating different measures of
                                           outcome (e.g., movement rate and
                                           ratio of loss rates; one might say
                                           "Because the ratio of loss rates was
                                           low, the advance rate was fast.")

4     Yes          No          Yes         Some highly aggregated models that
                                           reflect doctrine and experience
                                           (e.g., march times for unopposed
                                           moves).

5     No           Yes         Yes         Incomprehensible but reliable
                                           black-box models with high resolution
                                           in entities and processes (e.g.,
                                           poorly coded models with little
                                           documentation or explanation
                                           capability).

6     No           Yes         No          Models with high resolution in
                                           entities and processes, but poor
                                           algorithms (e.g., weapon-on-weapon
                                           attrition calculations assuming
                                           perfect tactical command and
                                           control).

7     No           No          Yes         Rules-of-thumb models or statistical
                                           models that work for no clear reason
                                           and do not represent system structure
                                           (e.g., a regression model predicting
                                           the next week's weather as a function
                                           of today's weather and the month of
                                           year).

8     No           No          No          Bad models.

To illustrate a few points in Table
VI-1, consider first that a model can be
excellent, even definitive, for explaining
phenomena after the fact, and yet be useless
for prediction (e.g., Case 2), at least in the
usual  sense.  This  happens  if  the  model 
depends  on  the  values  of  variables  that  are 
unknown until after the fact (e.g., the fighting
quality of the other side’s forces). This
situation  occurs  commonly  with  military 
models,  since  we  do  not  know  the  detailed 
initial conditions for future military operations.
Nor do we know the various decisions
that will be made in the course of the
operations.  After  the  fact,  these  decisions 
and  other  previously  unknowable  variables 
may  be  unambiguous  and  objective  data 
(e.g.,  as  reflected  in  operations  orders  and 
reports  on  what  the  weather  was).  If  the 
model  then  explains  the  phenomena  well  in 
retrospect  (sensibly  as  well  as  accurately), 
the  model  is  descriptive.  '* 

As  a  second  example,  structural 
validity  does  not  imply  that  the  attribute 
values  are  correct  or  that  the  algorithms 
constituting  the  model  processes  are  precise. 
A  model  of  combat  might  be  structurally 
valid  while  treating  attrition  quite  approxi¬ 
mately;  it  would  have  an  attrition  process, 
but  the  process  would  be  inaccurate  (Case 
6). 

The  most  subtle  example  here  is 
probably  that  predictive  validity  does  not 
imply  descriptive  validity,  in  our  sense. 
One  can  have  an  empirically-based  model, 
perhaps  in  statistical  form,  which  is  remark¬ 
ably  predictive,  but  which  says  little  or 
nothing  about  the  cause-effect  relationships 
at  the  levels  of  physical  entities  and  process¬ 
es  (Case  7).  It  is  often  difficult  to  know 
when  such  models  will  fail,  but  they  are 
useful  nonetheless.”  ” 


Again,  then,  the  point  here  is  that 
evaluation  of  models  should  vary  with  type. 
It  is  silly  to  denigrate  a  good  descriptive 
model  that  is  structurally  valid,'  merely 
because  it  is  not  a  prediction  machine  (given 
the  data  known  ahead  of  time).  This  is 
nontrivial,  because  many  critics  of  military 
modeling  are  guilty  of  precisely  this  error. 
Those  who  argue  that  attrition  estimates  for 
the  Desert  Storm  operation  were  off  by  an 
order  of  magnitude  overlook  the  fact  that 
many  analysts  were  explicit  about  their  esti¬ 
mates  being  upper  bounds  and  about  the 
potential  for  much  lower  attrition  if  the 
Iraqis  proved  ineffective  by  virtue  of  poor 
morale,  training,  leadership  and  so  on. 

Issues  of  Degree  and  Confidence 

The  words  “degree”  and  “confi¬ 
dence"  appear  in  my  definition  of  “validi¬ 
ty,”  because  models  are  seldom  perfectly 
valid  in  any  of  the  dimensions  (description, 
structure, or prediction). They vary in their
accuracy and precision. Also, there are
several dimensions of confidence, since:

•  The  model  or  its  data  may  be  known 
to  be  highly  uncertain  (e.g.,  in  func¬ 
tional  form  or  in  data  values). 

•  The  model  and  its  data  may  rep¬ 
resent  a  best-estimate  consensus  of 
experts,  but  may  nonetheless  be  fun¬ 
damentally  wrong  (e.g.,  Ptolemaic 
astronomy).  One  dimension  of  confi¬ 
dence,  then,  relates  to  assessing  the 
likelihood  of  the  bare  model  or  its 
data  having  serious  flaws  that  have 
not  yet  been  thought  of  or  taken 
seriously. 

•  A model may be deterministic, while
the relevant world may be
stochastic.” In this case, confidence
in the model’s predictiveness depends
on the underlying probability
distributions.  If  the  distribution 
function  is  strongly  weighted  around 
a  central  point,  then  a  deterministic 
model  may  be  reasonable;  if  the 
function  is  bimodal,  then  such  a 
deterministic  model  may  be  down¬ 
right  misleading. 
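
The last point can be illustrated with a small Python sketch. It assumes, purely for illustration, that outcomes are drawn from an equal-weight mixture of normal modes and that the deterministic model reports the mean outcome; the function names, modes, spreads, and the 20% band are hypothetical choices, not values from any real model.

```python
import random
import statistics


def point_estimate_coverage(samples, rel_tol=0.2):
    """Fraction of outcomes within rel_tol of the mean (the deterministic 'answer')."""
    mean = statistics.fmean(samples)
    band = abs(mean) * rel_tol
    return sum(abs(x - mean) <= band for x in samples) / len(samples)


def draw_outcomes(n, modes, spread, seed=0):
    """Draw n outcomes from an equal-weight mixture of normal modes."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice(modes), spread) for _ in range(n)]


if __name__ == "__main__":
    peaked = draw_outcomes(5000, modes=[100.0], spread=5.0)
    bimodal = draw_outcomes(5000, modes=[40.0, 160.0], spread=5.0)
    print(f"peaked : {point_estimate_coverage(peaked):.2f}")
    print(f"bimodal: {point_estimate_coverage(bimodal):.2f}")
```

Both distributions have a mean near 100, so a deterministic model would report roughly the same answer in each case. In the peaked case nearly every outcome lies close to that answer; in the bimodal case almost none do.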

For  all  of  these  reasons  the  process 
of  validation  should  include  reaching  explic¬ 
it,  albeit  often  subjective,  judgments  about 
the  confidence  one  places  on  the  model. 
These  can  be  aided  by  sensitivity  analyses 
coupled  with  analysis  assessing  how  much 
one  truly  knows  about  the  more  critical 
variables  in  the  context  of  a  shooting  war. 
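
A one-at-a-time sensitivity analysis of the kind mentioned above might be sketched as follows. The helper name and the 10% perturbation are assumptions made for illustration, and the falling-body relation of Section 6.2.2 is used only as a stand-in for a real model.

```python
def one_at_a_time_sensitivity(model, baseline, rel_change=0.1):
    """Relative change in output per relative change in each input, one at a time."""
    base_out = model(**baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline, **{name: value * (1.0 + rel_change)})
        result[name] = ((model(**perturbed) - base_out) / base_out) / rel_change
    return result


def fall_distance(g, t):
    """Distance fallen under the Section 6.2.2 bare model: 1/2 g t^2."""
    return 0.5 * g * t * t


if __name__ == "__main__":
    sensitivities = one_at_a_time_sensitivity(fall_distance, {"g": 32.0, "t": 10.0})
    for name, s in sorted(sensitivities.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name}: {s:.2f}")
```

Such a ranking identifies which inputs matter most; the separate question of how well each of those inputs is actually known still has to be answered by judgment.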

Some  examples  may  be  useful  here 
to  illustrate  how  central  the  issue  of  confi¬ 
dence  really  is  in  the  use  of  military  models. 
Consider the following hypothetical statements
about models being made by analysts
to  general  officers  in  the  context  of  a  real 
war  or  preparations  for  such  a  war: 

The  strategic-mobility  model 
itself  is  solid,  for  aggregate 
predictions,  but  predictions 
depend  on  planning  factors 
and  decisions.  We  should 
plan for buildup rates +/- 30%
around baseline data.

Also,  we  should  recognize 
that  the  CINC  may  make 
significant  changes  in  the 
Time-Phased Force Deployment
List (TPFDL), so we
must  anticipate  the  kinds  of 
changes  he  would  most  likely 
seek  and  consider  their  con¬ 
sequences  on  predicted  build¬ 
up  rate. 


Because  of  uncertainties, 
including  random  factors  and 
intrabattle  decisions,  we  have 
no  confidence  in  predicting 
winner  or  loser  (or  low  casu¬ 
alties)— unless  we  can  stack 
the deck by going for a 6:1
local  force  ratio  after  bomb¬ 
ing.  Then  we  would  be 
confident. 

Results  will  depend  on  sur¬ 
prise  and  speed.  That’s 
beyond  our  model’s  ability  to 
predict  well.  The  model  is 
descriptive  after  the  fact,  but 
that  doesn't  tell  us  what  we 
need  to  know  now.  We  can 
instead  tell  you,  as  a  com¬ 
mander,  how  quickly  we 
think  you  need  to  maneuver 
for  success,  based  on  intelli¬ 
gence  estimates  on  the  ene¬ 
my’s  reaction  times  and 
maneuver  speeds  as  judged 
from  doctrine  and  exercises 
over  the  last  few  years. 
Whether  you  can  do  that  is 
difficult  for  us  to  judge. 

The  ECM-ECCM  model  is 
very  accurate  for  aircraft 
flying  against  the  SA-99  as 
we  know  it,  but  the  enemy 
may  have  changed  subsys¬ 
tems,  in  which  case  noise 
jamming  would  be  unchanged 
but  false-target  generation 
might  not  work  at  all.  We 
simply  don’t  know  whether 
he  has  changed  systems. 

All  of  these  statements  could  be 
made quantitative to avoid ambiguity, but
my own recommendation is to use the language
of odds in a context that downplays
confidence and reminds everyone of the
stakes (e.g., men’s lives) rather than using
the  language  and  tone  of  statistical  precision. 
As  an  example: 

If  we  have  characterized  the 
SA-99  correctly,  as  we  think 
we  have,  our  ECM  should  be 
less than 1% (between about
0.5% and 1%). If the enemy
has changed subsystems
and  can  defeat  our  false- 
target  generation  (this  is 
highly  subjective,  but  I’d  say 
that’s a 1-in-4 situation), then
our rough calculations suggest
our losses will be about
1-2% per sortie until we can
destroy  the  SAMs.  Even  in 
the  bad  case,  we  estimate  that 
losses won’t be worse than 3-4%
per sortie because they
have  a  limited  number  of 
SAMs.  That  loss  rate  might 
last  up  to  three  or  four  days, 
but  we’re  very  confident  we 
will  destroy  the  SAMs  in  no 
more  than  that  time. 
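
The arithmetic behind a statement of this kind can be sketched in Python. The sketch keeps the two cases separate, each with its own odds, rather than averaging them into one number, in keeping with the recommendation above. The odds and per-sortie loss rates are taken from the upper ends of the ranges in the quoted statement; the function name and the count of eight sorties are assumptions made only for illustration.

```python
def survival_by_case(cases, sorties):
    """Map each case to (odds of the case, chance one aircraft survives)."""
    return {
        name: (odds, (1.0 - loss_per_sortie) ** sorties)
        for name, (odds, loss_per_sortie) in cases.items()
    }


if __name__ == "__main__":
    # Odds and loss rates follow the quoted briefing (upper ends of its ranges);
    # the sortie count of 8 is an assumption made only for this illustration.
    cases = {
        "SA-99 as characterized": (0.75, 0.01),
        "subsystems changed": (0.25, 0.02),
    }
    for name, (odds, survival) in survival_by_case(cases, sorties=8).items():
        print(f"{name}: odds {odds:.2f}, survival over 8 sorties {survival:.3f}")
```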

Data  Validation 

In  most  of  this  study  I  treat  data  vali¬ 
dation  as  part  of  validation  generally.  It  is 
worth  mentioning  some  unique  features  of 
data  validation,  however.  These  relate 
primarily  to  the  types  of  data  one  uses  to 
introduce  facts,  official  estimates,  and  other 
numbers  rather  than,  say,  the  types  of  data 
one  may  use  to  define  aspects  of  the  model 
(e.g.,  spatial  resolution  or  exponents  in 
equations).  In  this  activity  one  typically 
reviews  the  data  sources  and  how  they  were 
collected  to  compare  model  input  data  to 


real-world  or  best-estimate  values.  This 
may  involve  assessing  the  credibility  of  data 
sources  and  comparing  alternative  data 
bases.  In  reviewing  operational  data,  one 
must  consider  exercise  artificialities  such  as 
safety-related  constraints  and  geography. 
Data  validation  is  often  quite  troublesome. 
Intelligence  estimates,  for  example,  may 
vary  widely  with  little  rationale  given  and 
estimates  of  system  effectiveness  for  U.S. 
weapons  are  often  extrapolations  from  small 
data  samples  collected  under  artificial  condi¬ 
tions. 

6.2.5.  Accreditation


Accreditation  (often  used  syn¬ 
onymously  with  certification)  is  an 
official  determination  that  a  model  is 
acceptable  for  a  specific  purpose 
(e.g.,  to  a  class  of  applications  or  to 
a particular analysis or exercise).


Accreditation By Class of Application Vs.
Specific  Application 

Except  for  the  parenthetical  phrases, 
this is a commonly accepted definition (e.g.,
Williams and Sikora, 1991 and U.S. Army,
1992).  It  says  that  accreditation  is  a  deci¬ 
sion  (not  just  a  process)  to  the  effect  that  a 
given  level  and  character  of  verification  and 
validation are sufficient to justify using a
model  in  a  particular  application.'^ 

Problems  arise  not  with  the  defini¬ 
tion,  but  with  what  organizations  charged 
with  model  VV&A  sometimes  try  to  do.  It 
would  be  convenient  for  such  organizations 
if  models  could  be  definitively  accredited  for 
broad  classes  of  applications,  but  even 
within  a  given  class  of  applications  (e.g., 
weapon-system  comparisons),  a  model  will 
sometimes be adequate and sometimes not.
Which situation applies depends on details,
including  numerical  details  and  the  sensitivi¬ 
ty  of  results  to  errors  in  model  performance. 
Also,  some  models  that  might  be  thought 
inappropriate  to  a  particular  application  can 
be  used  effectively  if  manipulated  cleverly 
with  the  benefit  of  parametric  variations 
informed  by  side  calculations.”  It  follows 
that  class-level  accreditation  should  be 
provisional only, and that accrediting authorities
should be extremely cautious in
claiming  that  models  cannot  or  should  not 
be  used  for  applications  within  a  given 
class.  Those  long  familiar  with  VV&A 
issues  and  organizational  behavior  are  per¬ 
haps  most  concerned  about  this  problem, 
because  they  see  the  potential  for  mischief 
when  controversial  studies  use  models. 
Another  concern  here  stems  from  the  obser¬ 
vation  that  organizations  sometimes  insist 
that  “accredited  models”  be  used  for  studies 
even  when  those  models  are  inappropriate 
compared  to  alternatives  that  have  not  yet 
been accredited, or even fully developed.
Furthermore,  many  fear  that  the  accredita¬ 
tion  process  will  place  too  much  of  a  premi¬ 
um  on  verisimilitude  and  too  little  emphasis 
on clarity, controllability, and efficiency.

A Crucial Issue in Sound Accreditation:
Model  Clarity 

It  is  perhaps  a  symptom  of  the  dis¬ 
connect  between  analysts  and  those  who 
build  and  sponsor  models  that  discussions  of 
VV&A  seldom  mention  one  of  the  most 
important  considerations  in  evaluating  a 
model:  its  clarity.  One  could  argue  that  the 
definition  of  validation  should  be  modified 
to  include  such  considerations,  but  I  have 
chosen in this study to argue that these considerations
are very much in the province of
those  who  oversee  particular  uses  of  models. 
They  have  an  important  stake  in  model 
clarity,  because: 


•  They  are  responsible  for  results  and 
their  ability  to  review  the  work  (or 
have  it  reviewed  by  independent  ex¬ 
perts)  depends  on  their  ability  to 
comprehend  the  model  and  the 
cause-effect  relationships  dominating 
results. 

•  They  are  responsible  for  commu¬ 
nicating  results,  which  typically  re¬ 
quires  separating  essentials  from 
noise. 

• They may want to be able to reproduce
the work, which will be far
easier  if  it  has  been  conducted  with 
a  comprehensible  model. 

It  follows,  then,  that  accreditation 
should  depend  not  only  on  the  soundness  of 
the  model  for  the  application  at  hand,  but  on 
the ease with which the model can be comprehended
and the results of the model
understood in terms of appropriate cause-effect
relationships. That is, model accreditation
should depend not only upon model
soundness  for  the  application,  but  also 
upon:  (a)  comprehensibility  of  the  model  and 
(b)  comprehensibility  of  model  runs  (through 
“explanation  capabilities”).  This  facet  of 
the  problem  has  been  greatly 
underappreciated  in  prior  discussions  of 
VV&A, even within the academic community
and even by systems analysts, who certainly
wax eloquent about the need for
model  simplicity  in  other  contexts.  I  ob¬ 
serve  also  that  the  importance  of  model 
clarity increases the importance of establishing
a model’s descriptive validity.

6.3. A TAXONOMIC VIEW: THE CONSTITUENTS OF VV&A

6.3.1. Prefatory Distinctions

Given  the  above  definitions,  how 
does one accomplish VV&A? Suppose one
is attempting to establish a regime
within an organization, a regime in which
one  routinely  does  virtuous  evaluation  before 
using  models  for  analysis.  How  does  one 
go  about  it? 

It  is  useful  first  to  make  some  dis¬ 
tinctions: 

•  Components  vs.  system  (or  modules 
vs.  integrated  model) 

•  Bare  models  vs.  data 

• Evaluating “best-estimate” functional
forms  and  data  values  vs.  evaluating 
ranges,  distributions,  and  confidence 

• Conducting “broad VV&A” with
only a partial sense of the intended
applications vs. conducting focused
VV&A for a particular study

VV&A  applies  to  each  half  of  each 
of  these  pairs.  I  emphasize  this  up  front, 
rather  than  repeating  it  at  every  point  of  the 
following  discussion. 

6.3.2.  A  STRUCTURAL  PERSPECTIVE: 
THE  COMPONENTS  OF  VV&A 

Figure VI-1 now provides a structural,
or taxonomic, view of what constitutes
VV&A. It elaborates on validation,
because  that  aspect  has  been  most  controver¬ 
sial  and  confusing  over  the  years.  I  use  the 
phrase  “generalized  validation”  or  “evalua¬ 
tion”  here,  because  my  sense  of  validation  is 
broader  than  that  of  some  authors. 


FIGURE  VI-1.  A  Taxonomic  View  of  VV&A 
6.3.3. Verification Methods

Although this study does not emphasize
verification methods (see Sargent, 1987
and Martin Marietta, 1990 for more discussion),
the traditional methods include (a)
walking through the design and code; (b)
studying flow diagrams; (c) checking algorithms;
and (d) using CASE tools. Significantly,
modern software methods coupled
with  the  development  of  expert  systems  to 
assist  verification  can  greatly  improve  the 
quality  of  models  and  the  efficiency  of  the 
verification  process  (e.g.,  by  detecting 
errors  when  they  are  introduced).  Many  of 
the  methods  seem  mundane  when  described, 
and  may  seem  burdensome  to  those  who 
must do the typing of code, but they are exceptionally
powerful and have not yet been
fully exploited. Examples with which I am
personally familiar include:

•  Strong  typing  in  computer  languages, 
which  detects  a  wide  variety  of  typo¬ 
graphical  errors  and  ambiguities  such 
as  having  different  names  for  the 
same  variable  or  different  variables 
with  the  same  name. 

•  Range  constraints  on  variable  values, 
which  are  entered  (as  data)  at  the 
time  variables  are  declared  and 
which  allow  the  executing  program 
to  become  aware  of  likely  errors  (as 
evidenced  by  variables  taking  on 
values  outside  the  prescribed  ranges) 
and  to  print  error  messages. 

•  Automatic  testing  for  logical  com¬ 
pleteness  in  decision  tables  and 
equivalent  sets  of  If-Then-Else  loops. 

• Well structured “explanation logs” at
alternative levels of detail, which
allow a reviewer quickly to scan not
only final results but values of intermediate
variables and the logical
paths being taken in the simulation.

•  Use  of  object-oriented  design  meth¬ 
ods,  which,  when  physically  natural, 
provide  improved  modularity  and 
better organized data structures that
simplify  verification. 
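
As an illustration only, the completeness test mentioned in the list above can be sketched in a few lines. The sketch enumerates every combination of condition values and reports combinations that no rule covers, as well as combinations for which rules disagree. The condition names, values, and rules are hypothetical and are not drawn from any particular model.

```python
from itertools import product


def check_rules(conditions, rules):
    """Return (uncovered, conflicting) combinations of condition values.

    conditions: dict mapping condition name -> list of possible values
    rules: list of (pattern, action); a pattern maps some condition names
           to required values, and omitted names match any value.
    """
    names = list(conditions)
    uncovered, conflicting = [], []
    for combo in product(*(conditions[n] for n in names)):
        case = dict(zip(names, combo))
        # Collect the distinct actions of every rule that matches this case.
        actions = {
            action
            for pattern, action in rules
            if all(case[k] == v for k, v in pattern.items())
        }
        if not actions:
            uncovered.append(case)
        elif len(actions) > 1:
            conflicting.append((case, sorted(actions)))
    return uncovered, conflicting


# Hypothetical decision table, for illustration only.
conditions = {
    "force_ratio": ["favorable", "unfavorable"],
    "supplies": ["adequate", "low"],
}
rules = [
    ({"force_ratio": "favorable", "supplies": "adequate"}, "attack"),
    ({"supplies": "low"}, "hold"),
    ({"force_ratio": "unfavorable", "supplies": "low"}, "withdraw"),
]
uncovered, conflicting = check_rules(conditions, rules)
print("uncovered:", uncovered)
print("conflicting:", conflicting)
```

Exhaustive enumeration is practical only for small tables; it is used here because it makes the notion of completeness explicit.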

These techniques can be especially useful
for verification of implementation in code,
but can also be useful in highlighting spurious
logic (e.g., in explanation logs).

6.3.4.  Validation  Methods 

Validation  as  a  Holistic  Process 

Most  experienced  modelers  and  ana¬ 
lysts  consider  validation  to  be  a  holistic 
evaluative  process  that  includes  many  differ¬ 
ent  kinds  of  testing.  Some  of  this  may  be 
classic  empirical  testing  of  the  sort  often 
associated  with  the  scientific  method.  In 
practice,  however,  it  is  only  rarely  possible 
in  policy  analysis  to  conduct  the  controlled 
experiments  necessary  for  such  rigorous 
testing  of  the  model  as  a  whole.  Where 
such  experiments  are  feasible,  they  should 
be  greatly  valued,  but  we  cannot  conduct 
controlled  wars  or  even  perfectly  controlled 
battles  (nor  can  we  conduct  perfectly  con¬ 
trolled  social  experiments  on  matters  such  as 
health  care  options). 

We  must  settle  for  something  a  good 
deal  less  than  idealized  scientific  rigor. 
Nonetheless,  there  is  ample  opportunity  for 
empirical  work.  As  suggested  by  the  empir¬ 
ical-evaluation  column  of  Figure  VI-1, 
some aspects of models can be tested or
informed  by  comparisons  with  historical 
data,  field-test  data,  or  data  from  operational 
maneuvers  and  other  exercises.  This  data  is 
not  usually  as  well  controlled  or  as  directly 
relevant as one might like, but it is very
useful  nonetheless. 

Looking  to  the  central  column  of 
Figure  VI-1,  other  less  empirical  methods 
should  be  key  players  in  generalized  valida¬ 
tion.  The  first  is  theoretical  analysis  (e.g., 
working  through  the  substantive  logic, 
checking  relevant  verisimilitude,  considering 
the  reasonableness  of  assumptions,  applying 
criteria such as requiring falsifiability and
the use of Ockham’s razor, and comparing
assumptions and implications of the model
with well established theories from physical
science, engineering, and military science).
Theoretical  analysis,  then,  goes  well  beyond 
what  is  suggested  by  the  phrase  “logical 
validation,”  which  sometimes  appears  in 
discussion  of  VV&A  (e.g.,  Williams  and 
Sikora,  1991).  Theoretical  analysis  often 
exploits  special  cases  in  which  it  is  possible 
to  compare  the  model  in  question  with  exact 
calculations based on rigorous or otherwise
well established theories. Sargent (1986,
1987) lists some of the various methods that
can be used in this connection.

Looking  to  the  rightmost  column  of 
Figure  VI-1,  there  are  a  variety  of  other 
comparisons  one  can  make  to  evaluate  a 
model.  These  include  comparisons  with 
expert  opinion,  doctrine,  and  so  on.  Final¬ 
ly,  Figure  VI-2  emphasizes  that  these  evalu¬ 
ations  all  feed  into  an  overall  evaluation 
holistically.  There  is  no  natural  order  or 
ranking  of  evaluation  methods,  despite 
efforts  to  create  one  (e.g.,  as  discussed 
ambivalently  in  Williams  and  Sikora,  1991, 
although subsequent MORS work has
dropped the effort to impose an order).
This  is  not  entirely  trivial,  since  false  ideals 
cause  trouble  and  the  ideal  of  believing,  for 
example,  that  data  from  maneuvers  is  the 
“best” and “most important” data to be used
in  validating  a  model  will  typically  be 
wrong.  Basically,  model  development  and 
evaluation  involves  using  many  sources  of 
information  and  tying  it  together  however 
one can. It is not so orderly as some would
have it.
FIGURE VI-2. Validation as a Holistic Process, Not a Linear Process

A Perspective on Validation

It  is  sometimes  useful  to  think  about 
validation  as  an  informal  application  of 
Bayesian  reasoning  under  circumstances  in 
which we can only estimate the probabilities.
Our objective is to develop representations
that are good enough “to bet on,” but
we will seldom have a sure bet and we therefore
want to have a sense of the odds for
each of a number of very different kinds of
wagers. This validation process is unquestionably
subjective, but not capriciously
so.  We  consciously  seek  information  that 
could  falsify  or  reinforce  our  judgments  and 
we  attempt  to  face  up  to  that  information 
when  we  obtain  it.  When  all  is  said  and 
done,  however,  we  must  do  something. 
That  is,  we  must  conduct  the  best  analysis 
possible  given  the  information,  time,  and  re¬ 
sources  available  to  us.  Ultimately,  valida¬ 
tion  (and  accreditation  as  well)  is  con¬ 
cerned  with  establishing  that  we  are  indeed 
doing  the  best  we  can— or,  at  least,  some¬ 
thing  that  is  “good  enough.”  It  cannot  be 
separated completely from context.
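
The informal Bayesian view can be made concrete with a small sketch in odds form. It assumes, purely for illustration, that each piece of evidence can be summarized by a subjectively estimated likelihood ratio and that the pieces of evidence are independent; neither assumption comes from the discussion above, and the numbers are hypothetical.

```python
from typing import Sequence


def update_odds(prior_odds: float, likelihood_ratios: Sequence[float]) -> float:
    """Posterior odds = prior odds times the product of likelihood ratios."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor into a probability."""
    return odds / (1.0 + odds)


# Hypothetical judgments: each ratio is P(evidence | model adequate)
# divided by P(evidence | model inadequate), estimated subjectively.
prior = 1.0                 # even odds before any review
evidence = [3.0, 2.0, 0.5]  # two supportive checks, one adverse
posterior = update_odds(prior, evidence)
print(posterior, odds_to_probability(posterior))
```

The value of such a calculation lies less in the final number than in forcing each judgment, including the adverse one, to be stated and faced.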

Issues of Breadth and Depth in Model Validation

A  model’s  validity  is  one  thing;  the 
extent  to  which  it  has  been  validated  is 
another  (i.e.,  a  good  model  may  not  yet  be 
known  to  be  good).  A  common  problem  for 
those  overseeing  the  development  and  use  of 
models  is  “How  much  validation  is 
enough?”  Another  question  is  “How  do  we 
start?”  Figure  VI-1  provides  a  checklist  of 
methods,  but  most  of  them  could  become 
lifetime  careers  when  dealing  with  complex 
models. It is therefore useful to make
some further distinctions, which also have
the  effect  of  suggesting  where  to  start. 

Depth  in  Validation.  As  with  most 
human endeavors, the value of validation
activity is described by a curve of marginal
returns— a  curve  that  rises  steeply  and  then 
begins  to  level  off  and  move  slowly  toward 
an  asymptote  (which  may  correspond  to 
considerable,  and  yet  incomplete,  confi¬ 
dence).  For  a  variety  of  reasons,  some  of 
which  could  probably  be  explained  theoreti¬ 
cally,  it  seems  to  be  the  case  that  even  a 
little  validation  can  go  a  long  way.  It  is  for 
this  reason  that  “face  validity  assessments” 
are  so  important  in  practice.  These  can  be 
attempted in each and every validation-related
box of Figure VI-1. Some examples
will  probably  convey  the  ideas.  Once  again 
I  use  the  technique  of  plausible  statements 
that  might  be  made  in  characterizing  a  mod¬ 
el’s  validity: 

Using  historical  data.  The 
model  is  absurd.  It  took  me 
all  of  30  seconds  to  discover 
from  the  output  graphics  that 
it  has  field  armies  moving  at 
an  average  speed  of  150 
km/day  over  the  course  of  a 
successful  ten-day  campaign. 
Probably,  some  nitwit  physi¬ 
cist  built  the  post-break- 
through  movement  algorithms 
after  thinking  about  how  fast 
tanks  can  drive.  Historically, 
opposed  movement  has  been 
more like 20 km/day, although
there have been special cases.

Using  field-test  and  exercise 
data.  The  model  is  exceed¬ 
ingly  optimistic  about  the 
effectiveness  of  TOW  mis¬ 
siles  (kills  per  shot  and  shots 
per  battery  per  battle),  proba¬ 
bly  because  of  using  test- 
range  data  uncritically. 
Results from the National
Test  Range  and  Desert  Storm 
give  a  very  different  picture. 

Using  simulator  data  (a  kind 
of  laboratory  data).  The 
model  for  pilot  acquisition 
rates  in  finding  mobile  targets 
is  in  fact  more  reliable  than 
what  the  pilots  are  telling  us 
anecdotally  based  on  normal 
training  practice.  There  have 
been  some  experiments  in 
simulators  that  demonstrate 
pilots  are  much  more  conser¬ 
vative  about  declaring  a 
target  detection  when  they 
are  concerned  about  friendly 
forces  being  in  the  region  or 
about  hitting  civilian  targets. 
In  terms  of  the  required 
signal-to-noise  ratio,  the 
difference  is... 

Testing  for  analytic  and 
scientific rigor. I quit reading
the documentation as
soon  as  I  discovered  that  the 
detection  model  assumes  a 
uniform  background  over 
areas  as  big  as  middle-eastern 
countries.  We  know  that  the 
ability  to  track  a  target  (not 
just  detect  it  once)  depends 
on  being  able  to  maintain  a 
reasonable  signal-to-noise 
ratio,  and  that  background 
varies  substantially  over 
distances  of  tens  of  meters, 
even  in  the  desert.  I  also 
note  that  the  model  ignores 
the  effects  of  cueing  and 
prior  knowledge  by  using 
independent probabilities. We
need a better acquisition
model. 

Looking  for  relevant  veri¬ 
similitude.  The  model  treats 
logistics quite crudely, at the
level  of  tons  per  day  of  con¬ 
sumption,  tons  on  hand  (by 
sector),  etc.  However,  it 
looks  about  right  in  aggre¬ 
gate:  divisions  in  intense 
combat use about ... tons per
day,  but  intensity  seems  to 
drop  pretty  quickly,  which  is 
reasonable. The real problem
is that there is no mechanism
in the model for one
side  to  affect  the  other  side’s 
supply  capability.  The  model 
is structurally unsound in that
respect.  It  doesn’t  even 
model  support  units  and 
allow  attacks  on  their  trucks. 

Evaluation  for  economy.  The 
model  may  or  may  not  be 
accurate  if  one  knows  all  the 
input  variables  precisely,  but 
it’s  going  to  be  impossible  to 
use  well  for  systems  analysis 
in  realistic  cases  where  we 
don't  know  those  values  in 
many  cases.  The  model  has 
so  many  tuning  parameters  it 
could  fit  anything  after  the 
fact,  but  I  don’t  think  it’s 
worth  much  for  our  purposes. 

Comparisons  with  familiar 
models.  Well,  it’s  a  different 
model,  of  course,  and  there 
are  scores  of  parameters  that 
I  didn’t  try  to  review  in 
detail,  but  the  model  at  least 
behaves reasonably in the
sense  that  it  gives  the  same 
picture  of  what  would  happen 
in  the  several  baseline  cases 
of  the  ...study  as  came  out  of 
the  full-up  war  game  at 
CINC  headquarters. 

All  of  these  examples  could  have 
been  the  result  of  fairly  casual  checks  of 
face  validity  by  different  experts.  None 
involved  detailed  testing.  In  my  experience, 
tests of face validity, in many dimensions, are
extremely valuable in uncovering the most
serious  errors.  It  is  a  prerequisite,  however, 
that  the  model  be  well  documented  and  that 
it  be  easy  for  experts  to  view  its  behavior 
(e.g.,  through  interactive  post  processing 
graphics  rather  than  fixed  hard-copy  out¬ 
puts). 
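
The first of the examples above, the check against historical rates of advance, is the kind of test that takes only a few lines once model output is easy to inspect. The sketch below is illustrative: the 20 km/day figure is the one quoted in the example, while the output samples, the function names, and the tolerance factor are assumptions.

```python
def average_advance_rate(positions_km, days):
    """Average rate of advance (km/day) between first and last sample."""
    elapsed = days[-1] - days[0]
    if elapsed <= 0:
        raise ValueError("need at least two samples at increasing times")
    return (positions_km[-1] - positions_km[0]) / elapsed


def face_check(rate, typical=20.0, factor=3.0):
    """Flag a rate that exceeds the historical figure by more than `factor`."""
    return "plausible" if rate <= typical * factor else "implausible"


# Hypothetical model output: front-line position sampled over ten days.
days = [0, 2, 4, 6, 8, 10]
positions_km = [0, 310, 590, 900, 1210, 1500]
rate = average_advance_rate(positions_km, days)
print(rate, face_check(rate))
```

A check of this kind does not establish validity; it only catches the gross errors that a knowledgeable reviewer would notice at a glance.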

Methods of face-validity testing depend
heavily on such things as the following:

•  Having  a  good  set  of  baseline  cases 
(standard  scenarios)  with  which  the 
reviewers  are  familiar 

•  Displays  of  aggregated  behavior 
(e.g.,  total  divisions  deployed  in 
theater  vs  time  or  average  divisional 
loss  rates  when  in  combat  vs  time) 

•  Highly  organized  and  comprehen¬ 
sible  overviews  of  model  approach, 
assumptions  and  parameter  values 
(more  generally,  good  documentation 
is  essential;  see  Annex  B  for  more 
discussion  of  documentation) 

• The ability to respond quickly to
spot-check requests (e.g., “What did
you assume for the value of ...?” and
“What does the plot of ... vs time
look like?” and “Show me, in code,
the algorithm (or rules) you used
for ...”)

•  The  ability  to  do  additional  spot¬ 
checking  cases  upon  demand  (e.g., 
“Let’s  see  what  happens  when  you 
assume the B-1’s ECM doesn’t
work.”) 

The  dangers  of  depending  only  on 
face  validity  are  obvious,  but  they  can  be 
mitigated  if  the  effort  to  do  face-validity 
checks  is  broad  enough,  includes  opportuni¬ 
ties  for  spot  checking  in  depth,  is  accom¬ 
plished  with  reviewers  having  a  range  of 
backgrounds,  and  mixes  review  of  “inputs” 
(model  structure,  assumptions,  etc.)  and 
“behavior.”  One  reason  such  testing  is  so 
valuable  is  that  poorly  done  models  often 
fail  immediately,  whereas  well  done  models 
are  the  result  of  serious  and  professional 
efforts  in  which  testing  and  validity-related 
discussions  are  an  everyday  way  of  life  for 
developers.  Given  such  efforts,  intensive 
review  sessions  can  cover  a  great  deal  of 
ground  quickly  because  the  developers  are 
“on  top  of  the  problem”  and  have  organized 
information  well. 

Detailed  validation  efforts  must  de¬ 
pend  primarily  on  module-by-module  testing 
during  development  and  on  special  meetings 
to  examine  critical  modules  in  depth.  It  is 
seldom  possible  with  large  military  models 
to  do  anything  like  comprehensive  testing  or 
evaluation of complete multi-module systems.

Special Issues With Knowledge-Based Models

Knowledge-based  models  such  as 
rule-based  or  algorithmic  and  rule-based 
decision models representing, e.g., military
commanders  or  operators  of  air  defense 
systems,  raise  special  issues  because  in  most 
cases  they  cannot  in  principle  be  validated  in 
the  sense  of  being  favorably  compared  with 
“the  real  system.”  Instead,  they  must  be 
evaluated  on  grounds  such  as  whether  they 
faithfully  represent  the  knowledge  of  rele¬ 
vant  experts,  whether  they  are  logical, 
internally  consistent  and  consistent  with 
various  physical  and  logical  constraints,  and 
so on. They can in some cases be falsified
by real-world experience in which other
variables  proved  to  be  critical,  but  ambitions 
must  be  limited.  Further,  there  is  a  wealth 
of  information  to  the  effect  that  experts 
often  give  misleading  testimony  about  what 
they  would  do  in  various  circumstances  and 
about  the  way  in  which  they  reason— not 
because  they  intend  to  mislead,  but  because 
they  have  only  a  limited  understanding  of 
their  own  cognition.  For  example,  when 
being  interviewed  experts  might  describe  a 
highly  rational  process  of  making  decisions, 
but  in  the  heat  of  actual  operations— with 
uncertainties,  fatigue,  and  time  pressures  all 
being  factors— their  behavior  might  reduce 
to  the  simplest  of  patterns,  some  of  them 
“irrational”  from  the  viewpoint  of  a  decision 
theorist.  To  make  things  worse,  most  ex¬ 
perts  have  never  encountered  many  of  the 
situations  for  which  we  may  be  asking  them 
to  predict  behavior.  Thus,  they  are  not 
really experts in the same sense that an
experienced  internist  is  an  expert  on  child¬ 
hood  diseases. 

It  follows  from  this  that  efforts  to 
validate  knowledge-based  models,  notably 
behavioral  models  of  various  types,  includ¬ 
ing  decision  models,  must  depend  much 
more heavily than one might like on combinations
of theory, logic, and spotty expressions
of expert opinion. It is essential that
efforts  to  build  such  models  be  highly  orga¬ 
nized  and  that  appropriate  testing  methods 
be  developed.  This  is  an  understudied  field, 
but  some  relevant  methods  that  have  been 
applied  in  a  number  of  domains  are  de¬ 
scribed  in  Veit,  Callero,  and  Rose  (1984) 
and  Veit  (forthcoming).  These  involve 
developing  rigorous  factorial  designs  for 
comparing  model  behaviors  with  behaviors 
of relevant experts, preferably in circumstances
approaching those that would be
encountered  in  the  real  world,  but  perhaps 
in war games as a next-best choice. Another
valuable empirical approach is to observe
experts  performing  in  field  exercises.  This 
can  usefully  supplement  interview  data  and 
theoretical  analysis  by  bringing  in,  to  some 
extent  at  least,  aspects  of  behavior  under 
stress  and  the  fog  of  war. 
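
To suggest what a factorial comparison of model and expert behavior might look like in the simplest case, the sketch below builds a full factorial set of cases and lists those on which a decision model and a set of expert responses disagree. It is not the procedure of Veit, Callero, and Rose, which is not detailed here; the factors, levels, and both decision functions are hypothetical stand-ins.

```python
from itertools import product


def full_factorial(factors):
    """Every combination of factor levels, as a list of dicts."""
    names = list(factors)
    return [
        dict(zip(names, levels))
        for levels in product(*(factors[n] for n in names))
    ]


def disagreements(cases, model_decision, expert_decision):
    """Cases where the model's choice differs from the expert's."""
    return [c for c in cases if model_decision(c) != expert_decision(c)]


# Hypothetical factors and levels, for illustration only.
factors = {
    "force_ratio": [0.5, 1.0, 2.0],
    "warning_time_hr": [1, 12],
    "fatigue": ["rested", "exhausted"],
}


def model_decision(case):
    # Stand-in for the decision model under evaluation.
    return "attack" if case["force_ratio"] >= 1.0 else "defend"


def expert_decision(case):
    # Stand-in for recorded expert responses to the same cases.
    if case["fatigue"] == "exhausted":
        return "defend"
    return "attack" if case["force_ratio"] >= 1.0 else "defend"


cases = full_factorial(factors)
diff = disagreements(cases, model_decision, expert_decision)
print(len(cases), "cases,", len(diff), "disagreements")
```

Because every combination of levels is presented, a pattern in the disagreements (here, all of them involve exhausted forces) points to a factor the model ignores.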

6.3.5.  Methods  of  Accreditation 

There  are  various  organizational  ap¬ 
proaches  to  accreditation,  but  this  subject  is 
best  discussed  in  the  next  section. 

6.4.  A  DYNAMIC  VIEW  OF  VV&A 

Overview 

Figure  VI-3  shows  a  dynamic  view 
of  VV&A  that  emphasizes  evaluation  and 
accreditation  of  a  model  in  the  context  of  a 
specific study. The importance of context
is  emphasized  by  putting  the  analytic  plan  in 
the  center.  It  is  here  one  starts— knowing  of 
course,  the  purposes  of  the  analysis.  Provi¬ 
sional  accreditation  for  a  class  of  applica¬ 
tions  could  emerge  from  a  similar  chart,  but 
I  will  not  deal  with  that  further  in  this  study. 

When  evaluating  a  model  for  a  spe¬ 
cific  application,  chances  are  that  the  model 
is  an  adaptation  of  a  previous  model  that  has 
been  subjected  to  some  degree  of  VV&A  or 
that  the  model  has  been  subjected  previously 
FIGURE VI-3. VV&A as a Continuing Process Sensitive to Context
(Process starts at the center)


to  considerable  “general”  VV&A  without 
the benefit of study-specific information.
Thus,  the  new  round  of  VV&A  shown  in 
Figure  VI-3  draws  on  previous  information 
(see  arrows  coming  in  from  top  left).  Most 
importantly, however, it depends heavily on
the  study-specific  requirements  and  test 
cases.  In  practice,  relatively  complex  com¬ 
bat  models  (or  most  other  models  used  in 
policy  analysis)  are  never  fully  tested  and 
unconditionally  accredited.  Testing  can  still 
be  extensive  and  sophisticated  for  the  pur¬ 
poses  of  evaluating  the  model  and  its  data  in 
the  context  of  a  specific  analysis.  That 
testing  is  the  basis  for  study-specific  accredi¬ 
tation,  but  it  also  adds  to  the  base  of  VV&A 
information  that  will  be  used  in  the  next 
iteration for a new application (see outward
arrows on bottom left and center right). One
feature  of  Figure  VI-3  (bottom  right)  is 
especially important and unusual. This is its
reference  to  constraints  and  guidance  as 
outputs  of  the  accreditation  process.  Since 
the  most  stringent  review  of  an  analytic 
organization’s  work  usually  occurs  within 
the  organization  itself,  one  may  think  of 
“accreditation”  as  being  the  result  of  man¬ 
agement  reviews  of  the  sort  that  should 
occur  early  in  a  project’s  life,  before  the 
project’s  work  is  reported,  and,  if  possible, 
at  least  once  in  between.  The  result  of  such 
a  review  might  take  the  following  form 
(think  of  this  as  the  summary  conclusions  of 
the  relevant  manager,  who  need  not  be  a 
government  official): 
On balance, our conclusions
are: 

(1) The analytic plan appears
to be sound.

(2) The model and data
base for carrying out the plan
appear to be sound.

(3) Consistent with the
improved plan, however, no
conclusions should be drawn
regarding ..., because the
analysis cannot support them.
Further, in drawing conclusions
on ..., it is essential that
they reflect parametric variations
on the following key
variables over the ranges
discussed in the review.
Recipients of the analysis
must understand the considerable
uncertainty associated
with ...

(4)  Further,  recipients  of 
the  analysis  must  be  remind¬ 
ed  of  the  following  basic 
assumptions  of  the  approach, 
which  appear  reasonable,  but 
which  also  establish  limita¬ 
tions  on  its  significance:... 

In  this  depiction  there  is  no  all-or- 
nothing  blessing  of  the  model— even  for  a 
specific  study.  Instead,  the  accreditation  is 
conditional  upon  the  analysis  plan  itself, 
which  includes  the  proposed  logic  to  estab¬ 
lish  conclusions.  Further,  the  accreditation 
process  often  results  in  changes  of  the  ana¬ 
lytic  plan  itself  (and  changes  in  the  model 
leading  to  another  round  of  verification). 
This iteration is merely implicit in Figure VI-3.


In  concluding  that  a  model  could  rea¬ 
sonably  be  used  for  the  purpose  at  hand,  the 
accrediting  authority  might  be  drawing  on 
highly  study-specific  information  and  pon¬ 
dering  in  some  detail  precisely  what  function 
the  model  itself  is  serving  (see  Hodges  and 
Dewar,  1992  for  a  list  of  such  functions  and 
related  discussion). 

One  can  imagine  judgments  such  as 
the  following  being  made  as  part  of  the 
accreditation  decision  and  explanation: 

The  model  is  suitable  here 
(e.g., in a war game being
used for higher level education
and training). Realistically,
it is being used primarily
as an organizing
device,  as  a  kind  of  book¬ 
keeping  mechanism.  The 
results  of  the  analysis  depend 
most  sensitively  on  the  hu¬ 
man  command-control  deci¬ 
sions,  including  operational 
strategies.  The  model’s 
treatment  of  attrition  is  fairly 
crude,  but  as  you  have  shown 
with your sensitivity analyses,
the attrition model is
not  the  limiting  factor. 

The  model  is  quite  suitable 
here,  despite  its  exceptionally 
simple treatment of close
combat.
The  results  depend  primarily 
on  the  air-to-ground  effec¬ 
tiveness  of  U.S.  air  forces, 
given  air  supremacy,  and  the 
time  required  for  us  to 
achieve  that  supremacy.  You 
have a rather detailed and
credible  treatment  of  both 
air-to-ground  effectiveness  as 
a  function  of  circumstance 
and  of  the  suppression  of  air 
defenses (SEAD).

You  must  be  kidding.  The 
model  can’t  possibly  be  used 
to  infer  conclusions  about  the 
proper  mix  of  tank  and  artil¬ 
lery  units,  because  it  bases 
ground  combat  attrition  on 
some  aggregation  expressions 
that  treat  MLRS  as  merely 
one  contributor  to  an  overall 
firepower.  Chances  are  the 
model  will  conclude  some¬ 
thing  like  “all  we  need  to  do 
is  buy  MLRS  batteries  and 
disband  the  rest  of  the  ar¬ 
my.”  That  would  be  fine  if 
battle  were  just  a  matter  of 
firepower. 

Yes,  I  know  that  you  think 
you  have  a  highly  sophisticat¬ 
ed  model  of  ground  combat, 
but  it  is  not  adequate  for  this 
study.  As  it  stands,  ground 
forces  are  unintimidated  by 
air  forces,  and  can  maneuver 
just  as  quickly  with  or  with¬ 
out  enemy  air  forces  attack¬ 
ing  them,  except  to  the  extent 
that  air  forces  can  destroy 
whole units. I don’t believe
this  for  a  moment.  Air 
forces  can  disrupt  and  delay, 
and  thereby  greatly  affect 
maneuver  and  tempo  gen¬ 
erally.  Go  back  to  the  draw¬ 
ing  boards— and  read  some 
history on the Battle of the
Bulge, especially the part
after  the  weather  cleared. 

Your  model  seems  fine  so  far 
as  it  goes,  covering  attrition 
and  movement  processes,  but 
it  treats  operational  strategy 
as  input  data  and  doesn't 
allow  adaptation.  That  leaves 
out  the  most  important  part 
of  force  employment.  Good 
forces  and  bad  strategy  lead 
to bad results (see, e.g.,
Davis and Hillestad, 1992).

An  important  point  to  be  made  here 
is  that  the  same  model  might  be  good  for 
some  force-composition  or  force-structure 
studies  and  altogether  inappropriate  for 
others.  Thus,  attempting  to  accredit  a 
model  for  whole  classes  of  studies  can 
readily  lead  to  bad  decisions.  It  would 
therefore  seem  appropriate  to  introduce  and 
use the concept of provisional accreditation
(suggested  to  me  by  Clayton  Thomas), 
which  would  be  used  in  the  context  of  con¬ 
cluding  that  “This  model  (and  its  data  base) 
is  a  reasonable  candidate  for  use  in  this  kind 
of study. Go ahead and flesh out the analysis
plan and let’s then see whether the plan
makes  sense  and  the  model  will  indeed  be 
adequate.”  This  emphasizes  yet  again  that 
it  is  the  analysis,  study,  or  other  application 
that  should  actually  be  “accredited.” 

6.5. ESTABLISHING A VV&A REGIME WITHIN AN ORGANIZATION

6.5.1. Prefacing Comments

In  thinking  about  VV&A  and  about 
how  to  improve  its  practice  in  organizations, 
it  is  important  to  recognize  that  VV&A 
should  not  be  seen  as  a  separate  and 
segmentable  enterprise— i.e.,  an  additional 
duty  or  task— but  rather  as  an  inherent  part 
of the analytic process from the time of
initial  design  to  the  time  of  particular  appli¬ 
cations.  Validation  is  central  to  the  scientif¬ 
ic  process  that  good  analysis  seeks  to  emu¬ 
late.  I  raise  these  matters  here  because 
VV&A  is  not  always  viewed  in  this  way. 
Indeed,  there  are  many  considerations  that 
undercut attempts to make analysis “scientific.”
For example, models are often tools of
advocacy;  further,  data  bases  are  often 
tightly  held  for  both  security  reasons  and 
information-is-power reasons. As a result,
there  are  significant  disincentives  for  orga¬ 
nizations  to  evaluate  their  models  and  data 
as  harshly  as  they  might  if  they  were  physi¬ 
cal  scientists  attempting  to  unravel  the  se¬ 
crets  of  the  universe.  It  is  therefore  a  sig¬ 
nificant  challenge  for  analytic  organizations 
to  rise  above  these  problems  and  instill  and 
maintain  a  sense  of  professionalism  and 
“scientific  method.”  This  is  a  continuing 
challenge,  not  one  that  can  be  addressed 
once  and  for  all  (see  also  Hughes,  1989,  pp 
10  ff).  With  this  background,  then,  let  us 
examine  how  an  organization  might  take  on 
the challenge.

6.5.2.  Considerations 

Establishing  a  VV&A  regime  must 
first  be  recognized  as  involving  all  of  the 
standard  challenges  associated  with  organi¬ 
zational  change  and  learning.  Simple  de¬ 
crees  have  very  limited  and  short-term 
value.  Instead,  one  must  think  in  terms  of 
such  matters  as: 

•  Creating  and  communicating  a  vision 
of  professionalism  that  treats  VV&A 
as  inherent  to  good  work  and  some¬ 
thing  to  be  done  continuously  rather 
than  merely  in  occasional  painful  and 
unrewarding  crash  efforts. 

•  Developing associated policies and
procedures, and assuring that there
are early examples for everyone to
see of how these will be implemented
in practice and what will be
accomplished.

•  Bringing  members  of  the  organi¬ 
zation  into  the  problem  so  that  they 
participate  in  developing  aspects  of 
the  general  policies  and  many  of  the 
procedural  details  —thereby  assuring 
proper  tailoring  to  the  organization’s 
particular  culture. 

•  Establishing  the  uncomfortable  prin¬ 
ciple  of  independent  review,  for  at 
least  critical  features  of  the  work, 
even  though  the  tendency  within 
organizations  is  usually  to  assume 
that  internal  review  is  quite  adequate 
and  that  the  call  for  independent 
review  is  insulting  and  a  potential 
waste of time.

•  In  all  of  this,  having  both  long-  and 
short-term  views  and  plans,  with 
short-term  efforts  being  designed  in 
part to illustrate what is intended on
a continuing basis for the long term.

•  By  distinguishing  short-  and  long¬ 
term  plans,  assuaging  fears  about 
unreasonable  new  demands  being 
added  immediately  to  project  bur¬ 
dens. 

•  Assuring  that  those  contributing  to 
the  changes  are  properly  recognized 
and  rewarded. 

Many aspects of this challenge can
be  helped  by  having  concrete  examples  to 
use  as  case  histories  that  everyone  reads. 
An  important  part  of  the  continuing  MORS 
effort on VV&A is to develop and, if possible,
to publish such histories.

6.5.3.  Using  the  Framework 

Against  this  general  backdrop  of 
challenges,  I  suggest  using  the  material  of 
this  study  as  follows: 

•  Use  the  definitions  and  related  dis¬ 
cussion  to  communicate  the  funda¬ 
mental  issues  of  VV&A. 

•  Use the taxonomy of VV&A methods
(Figure VI-1) to broaden perspectives,
break down biases, and
help establish short-term and long-term
plans. In the long-term plan,
for  example,  one  might  want  to  use 
many  of  the  validation  techniques 
mentioned,  but  that  would  require 
scheduling  and  finding  support  for 
tasks,  or  even  whole  projects,  for 
work  that  would  not  ordinarily  be 
done  at  all  (e.g.,  comparisons  with 
experiences  in  field  maneuvers  or 
large-scale  exercises).  Thus,  the 
taxonomy  should  be  used  primarily 
as  a  checklist. 

•  Use  the  dynamic  view  of  VV&A 
(Figure  VI-2)  to  frame  the  issues  in 
a realistic, technically solid, and
non-“political” way. Use it also to
develop  detailed  work  schedules  for 
projects— setting  aside  adequate  time 
for  iterative  reviews  and  follow-up 
model adaptation and testing. Use
this  view  of  the  problem  to  highlight 
the  substantive  role  of  accreditation 
(as  distinct  from  the  more  political 
role  emphasized  by  cynics)  and  its 
intellectual  relationship  to  traditional 
guidelines  on  how  to  run  analysis 
projects,  guidelines  that  apply  also  in 
many  ways  to  applications  such  as 


support  of  exercises  and  development 
of  decision  aids. 

•  When  identifying  VV&A  require¬ 
ments  for  a  particular  analysis, 
explicitly  consider  the  costs  of  fulfill¬ 
ing  those  requirements.  Then,  either 
assure  that  the  requirements  can  be 
met  by  making  available  the  neces¬ 
sary  resources  and  calendar  time  or 
adjust the analysis plan (or claims
made about the analysis when concluded).

•  Take  seriously  the  discussion  of  how 
special  measures  need  to  be  adopted 
in  evaluating  knowledge-based  mod¬ 
els  and  other  models  for  which  hard 
data  is  lacking.  Use  the  examples 
provided  here  and  develop  important 
distinctions  for  the  problems  at  hand. 

•  Use Figures VI-1, VI-2 and related
discussion  to  explain  to  sponsors 
how  VV&A  plans  are  consistent  with 
a  comprehensive  view  of  the  subject, 
drawing  also  upon  other  published 
materials  such  as  Sargent  (1987)  and 
methods used by Martin Marietta
(1990).  As  part  of  this,  focus  spon¬ 
sors  and  accrediting  authorities  (usu¬ 
ally  the  same  individuals)  on  the 
view  of  accreditation  that  encourages 
them  to  provide  intellectual  guid¬ 
ance,  not  merely  a  “yes”  or  “no” 
decision.  And,  as  part  of  this,  em¬ 
phasize  the  need  for  VV&A  activities 
to  be  adequately  supported  and 
scheduled  realistically  over  time. 

Finally,  let  me  mention  again  that  the 
examples  in  this  study  emphasize  applica¬ 
tions  in  which  models  are  used  for  analysis. 
Many readers will wish to develop analogous
examples for their own applications,
which may relate to training, education, operational
decision aids or other matters. While the
basic framework should hold up, the detailed
criteria for judging models are very
application dependent.

ANNEX A - ON SEPARATING CONCEPTUAL MODELING AND PROGRAMMING


In  a  classical  ideal  with  which  I  long 
had  sympathy,  the  design  and  review  of 
models (sometimes called conceptual models)
precedes programming. One develops
the  conceptual  picture  and  lays  out  the 
theory  and  algorithms  formally,  thereby 
creating machine- and language-independent
specifications (see, e.g., Figure VI-A-1
from  Sargent’s  work,  which  remains  useful 
even  if  my  arguments  here  are  accepted). 
Implementation  as  a  program  then  proceeds, 
but its details depend on hardware, software,
local practices, and other factors. In this
ideal,  substantive  discussion  should  focus  on 
the  model,  not  the  program.  This  ideal  has 
much  to  recommend  it,  because  enormous 
confusion  is  caused  by  having  problem 
formulation  shaped  and  described  in  terms 
peculiar  to  particular  languages  or  computer 
systems. 

In  practice,  however,  the  ideal  breaks 
down  for  both  good  and  bad  reasons.  The 
principal  bad  reason  is  that  many  organiza¬ 
tions  lack  the  discipline  to  enforce  serious 
design  before  allowing  programmers  to 
write  code;  the  results  are  predictable: 
incomprehensible  models  that  are  merely 
implicit  in  long  and  complex  computer  code. 

The  good  reasons  have  to  do  with 
technology  and  the  changing  ways  in  which 
workers, even workers with a theoretical
bent, go about their efforts. First, it is becoming
increasingly  possible  and  attractive  to  work 
largely  at  the  computer  rather  than  with 
pencil  and  paper— even  for  constructing  top- 
down  conceptual  designs.  Second,  some  of 
the  computer  tools  for  doing  so  blur  the 


distinction  between  design  and  program¬ 
ming,  because  when  one  creates  the  initial 
design  elements  (e.g.,  variable  names,  data 
structures  such  as  objects,  functions,  and 
diagrams),  the  results  automatically  generate 
corresponding  program  elements  (see  Annex 
B).  Third,  with  some  high-level  languages, 
it  is  as  easy  for  reviewers  to  understand  and 
comment  upon  algorithms  expressed  as 
computer  code  (or  related  diagrams)  as  it  is 
for  them  to  do  so  in  a  paper-and-pencil 
mode. Fourth, advanced tools such as
Mathematica now make it possible to solve
equations symbolically on line, which enhances
the design process. And, lastly,
statements  of  the  conceptual  model  often 
underspecify the problem, resulting in programmers
filling in and thereby having
much more of a role in defining the “real”
model than was intended. In some respects,
it is only realistic to force model designers
to address explicitly what they might otherwise
tend to assume are mere implementation
issues (e.g., time steps, control flow in
procedural  problem-solving  approaches,  and 
whether  to  organize  around  data  structures 
or  processes). 

A  related  issue  here  is  that  of 
prototyping.  In  the  last  decade  workers 
have  come  to  appreciate  the  efficiency  of 
rapid  prototyping  as  a  mechanism  for  help¬ 
ing  designers  understand  the  problem  for 
which  they  are  tasked  to  build  models.  In 
practice,  it  is  common  for  even  first-rate 
modelers  and  analysts  to  misunderstand 
major  elements  of  the  problem  until  they 
have  actually  built  something  and  worked 
with it. While preliminary design is necessary,
it is seldom sufficient, and those with
modern software tools tend strongly to
recommend highly iterative development that
exploits prototyping and the discovery process
as an inherent part of high-quality
work, not something to be apologized for.

FIGURE VI-A-1. An Idealized Separation of System, Model and Program

While  I  continue  to  recommend  sepa¬ 
rating  model  design  from  design  of  detailed 
implementation,  and  while  I  still  believe  it  is 
desirable  for  many  aspects  of  a  model  to  be 


reviewed  away  from  the  computer  context, 
which  tends  still  to  encourage  a  linear  line- 
by-line  view  and  inelegant  solution  tech¬ 
niques,  the  original  ideal  is  now,  in  my 
view,  obsolete.  It  is  a  major  challenge  for 
developers  to  create  new  operating  proce¬ 
dures  that  will  maximize  benefits  of  comput¬ 
er  environments  while  maintaining  an  appro¬ 
priate  separation  of  model  and  implementa¬ 
tion  detail. 

ANNEX B - DOCUMENTATION, HIGH-LEVEL COMPUTER
LANGUAGES, AND MODERN MODELING AND ANALYSIS
ENVIRONMENTS


6.B.1 DOCUMENTATION

A  prerequisite  for  VV&A  is  docu¬ 
mentation,  but  many  DoD  combat  models 
are  inadequately  documented.  To  improve 
this  situation,  it  is  important  to  know  what 
constitutes  good  documentation.  The 
DMSO’s  Applications  and  Methodology 
Working Group discussed this at some
length  in  1991,  drawing  heavily  on  experi¬ 
ence  of  the  participants,  many  of  whom  had 
actually developed large models and/or evaluated
them in detail. It agreed that the
following  guidelines  are  especially  impor¬ 
tant: 

•  Distinguish  model  from  program 
(i.e.,  describe  the  conceptual  model 
in  terms  that  are  language  indepen¬ 
dent  and  focused  on  the  underlying 
concepts  and  relationships) 

•  When  appropriate,  describe  model  in 
object-oriented  terms,  even  if  the 
implementing  program  is  not  object 
oriented

•  Require  high-level  designs  describing 
motivation,  rationale  and  basic  as¬ 
sumptions,  plus: 

-  Hierarchical top-down structures
(where hierarchies apply) and
data-flow diagrams to show how
inputs get transformed into outputs

-  Meanings of variables (input
to data dictionaries)

-  Logical or algorithmic detail
on selected key modules

-  Structured and commented
source code, even though this
cannot replace documentation,
especially higher level
documentation

-  Program and interface documentation
and illustrative-scenario
“walkthroughs”

Distinguishing  the  model  from  the 
program  is  important  in  sharpening  and 
communicating  concepts,  even  if  the  argu¬ 
ments  of  Annex  A  are  accepted.  Program¬ 
mers  often  talk  about  pointers,  memory, 
stacks,  arrays,  and  other  constructs  having 
nothing  to  do  with  military  phenomenology. 
Documentation  and  reviews  of  model  con¬ 
tent  should  instead  focus  on  phenomenology. 


One  important  element  of  good  docu¬ 
mentation  is  often  overlooked:  including  the 
procedures  and  results  of  any  previous 
VV&A efforts conducted during development
or applications. This can be exceptionally
useful.

There are limits to how much documentation
can be squeezed out of money-limited
projects. The most important documentation
consists of “High Level Designs,”
which  are  top-down  in  character  with  an 
emphasis  on  structure.  These  should  also 
define  key  variables,  provide  appropriate 
diagrams  showing,  e.g.,  information  flow 
and  control  flow,  and  provide  logical  or 
algorithmic detail on key submodels. It is
less  important,  and  may  even  be  inappropri¬ 
ate,  to  document  details  of  much  of  what 
constitutes  a  complex  combat  model,  since 
those details are often bookkeeping methods
best  understood  at  the  level  of  the  code 
itself.  The  code,  however,  should  be  well 
structured  and  commented.  Another  major 
element  of  documentation  is  information  on 
how  to  use  the  program  and  its  interfaces. 
This  is  often  best  done  by  providing  a  step- 
by-step  discussion  of  how  one  runs  and 
analyzes a test case (i.e., a walkthrough of
a  representative  application  in  a  given  sce¬ 
nario).  Commercial  software  tools  often 
have  excellent  “walkthrough”  documenta¬ 
tion. 

Taken  together,  then,  there  is  need 
for  documentation  on  the  model,  the  pro¬ 
gram,  and  its  use.  Increasingly,  on-line 
documentation  is  becoming  especially  impor¬ 
tant  for  procedural  information. 

Finally,  note  that  documentation 
methods  should  be  changing,  and  that  should 
be  reflected  in  work  on  comprehensive 
environments. 

6.B.2  HIGH-LEVEL  LANGUAGES 
AND  ENVIRONMENTS 

The phrase “high-level language” is
ambiguous,  because  there  are  multiple  di¬ 
mensions  along  which  to  measure. 
SIMSCRIPT was one of the first high-level
languages  designed  for  simulation.  It  was 
high  level  in  such  respects  as  providing  tools 
making  it  easy  to  construct  simulations.  It 
also  had  mechanisms  to  force  good  program¬ 
ming  practices  such  as  writing  an  overview 
of  the  model,  using  descriptive  identifiers, 
and  exploiting  class  concepts.  In  more 
recent  times,  spreadsheet  languages  such  as 
EXCEL may be considered very high level


in  the  sense  of  having  user  friendly  interfac¬ 
es  and  a  myriad  of  predefined  functions.  At 
the  same  time,  spreadsheet  programs  are 
usually  the  antithesis  of  structured  program¬ 
ming,  because  the  approach  taken  by  the 
novice  is  to  organize  by  spreadsheet  cells 
and  use  the  equivalent  of  many  GO  TO 
statements producing “spaghetti code.”
Further,  complex  spreadsheet  programs 
based on the systematic use of macros are
no  more  intelligible  than  those  of  other 
languages  such  as  BASIC,  and  arguably  less 
so. 

Against  this  background,  RAND  has 
been  developing  high-level  languages  that 
emphasize  using  relatively  natural  language 
for  key  words  and  that  exploit  the  cognitive 
effectiveness  of  table  structures  for  organiz¬ 
ing  both  information  and  logic.  RAND  now 
has seven years’ experience with RAND-ABEL,
which has been used to write hundreds
of thousands of lines of code. The
applications  have  ranged  from  decision 
models  (e.g.,  those  of  a  simulated  theater 
commander)  to  combat  models  (e.g.,  attri¬ 
tion  and  movement  processes  for  combat 
taking  place  on  a  network).  It  has  consis¬ 
tently  proven  possible  to  have  group  reviews 
of  major  portions  of  these  models  by  work¬ 
ing  directly  with  code,  even  though  many  of 
the  participants  have  not  been  serious  pro¬ 
grammers.  Errors  have  been  discovered  at 
a glance, and complex logic has been discussed
as a group. Most of this has been
possible  because  of  the  table  structures, 
which  should  be  developed  in  other  lan¬ 
guages  as  well. 
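
As a rough illustration of why table structures make logic reviewable by non-programmers, the following Python sketch encodes a small decision table of the kind a simulated commander might use. It is not RAND-ABEL syntax; the conditions, the table contents, the "--" (don't care) convention, and the function name decide are all hypothetical and chosen only for illustration.

```python
# Hypothetical decision table: each row lists condition values and a decision.
# "--" means the condition does not matter for that row.
DECISION_TABLE = [
    # surprise   force_ratio   decision
    ("yes",      "favorable",  "attack"),
    ("yes",      "even",       "attack"),
    ("no",       "favorable",  "attack"),
    ("no",       "even",       "hold"),
    ("--",       "adverse",    "withdraw"),
]


def decide(surprise, force_ratio):
    """Return the decision from the first table row matching both conditions."""
    for row_surprise, row_ratio, decision in DECISION_TABLE:
        surprise_ok = row_surprise in ("--", surprise)
        ratio_ok = row_ratio in ("--", force_ratio)
        if surprise_ok and ratio_ok:
            return decision
    # A missing row is reported rather than silently defaulted.
    raise ValueError(f"no rule covers ({surprise!r}, {force_ratio!r})")


print(decide("no", "even"))
```

Because the logic lives in the table rather than in nested conditionals, a reviewer can scan the rows for a missing or implausible case without reading control flow.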

In current work, RAND is developing
an object-oriented version of RAND-ABEL,
called Anabel. This will extend
the effort to exploit two-dimensional structures
of many kinds (e.g., decision tables,
tables of orders, and adjudication tables) and
will  also  include  numerous  self-documenting 
features, including the use of hypermedia.
Our belief is that model documentation will
not  improve  greatly  by  virtue  merely  of 
managers  cracking  whips.  Instead,  there  is 
both  need  and  opportunity  for  technology  to 
help.  Similar  ideas  are  being  pursued  at 
many  levels  by  a  variety  of  researchers, 
including  some  who  are  contemplating  the 
use of expert systems to help choose and use
verification  and  validation  tools  (see,  e.g., 
Oren,  1986  and  Sargent,  1986).  In  addition, 
researchers are developing a variety of
excellent  graphical  tools,  some  of  them 
capable  of  generating  code  directly.  The 
System Dynamics programs Stella and
iThink are especially notable here. Plans
call  for  a  variety  of  such  tools  to  be  used 
with  RAND’s  Anabel,  building  on  tools 
recently  developed  by  Larry  McDonough 
and  Richard  Hillestad.  One,  called 
Mapview,  allows  workers  readily  to  create 
objects  and  emplace  them  on  maps.  The 
results  of  what  they  do  with  the  graphical 
interface  generate  code.  Similarly,  a  tool 
called  the  Activity  Sequence  Editor  (ASE) 
allows  workers  to  develop  state-transition 
diagrams  for  object-oriented  programs,  and 
to  have  the  results  of  those  diagrams  gener¬ 
ate  code.  All  of  this  facilitates  documenta¬ 
tion  and  VV&A,  because  many  aspects  of 
model  design  are  best  seen  graphically,  and 
because  the  tight  linkage  between  diagrams 
and  code  avoids  the  traditional  problem  of 
documentation  lagging  the  reality  embedded 
in  the  code  itself.  Despite  the  progress, 
however,  there  is  a  great  deal  to  be  done  in 
this  general  subject  area. 
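
The link between a state-transition diagram and running code can be sketched as follows. This is not the Activity Sequence Editor or its output; it is a minimal Python illustration, with hypothetical state and event names, of how a diagram stored as a table can be executed directly, so that the documented diagram and the program cannot drift apart.

```python
# Hypothetical state-transition table for a ground unit.
# Key: (current state, event); value: next state.
TRANSITIONS = {
    ("moving", "contact"): "engaging",
    ("engaging", "enemy_destroyed"): "moving",
    ("engaging", "heavy_losses"): "withdrawing",
    ("withdrawing", "reached_rear"): "halted",
}


def step(state, event):
    """Return the next state; events with no defined transition leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(state, events):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state


def terminal_states():
    """States that are entered but never left, a simple check for reviewers."""
    sources = {source for source, _ in TRANSITIONS}
    targets = set(TRANSITIONS.values())
    return targets - sources


print(run("moving", ["contact", "heavy_losses", "reached_rear"]))
print(terminal_states())
```

The same table could be rendered as a diagram for review, which is the property the paragraph above attributes to tightly linked graphical tools.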


6.B.3  A  THREAT  TO  ADVANCE¬ 
MENTS 

Progress  in  developing  and  dissemi¬ 
nating  advanced  modeling  and  analysis 
methods  and  tools,  including  many  that 
would  facilitate  VV&A,  will  be  adversely 
affected  if  the  DoD  attempts  to  force  all 
modeling  activities  into  a  single  structure 
and  language,  such  as  Ada  in  particular. 
Such  a  policy  would  hinder  efforts  to  exploit 
the  rich  selection  of  commercial  products 
that  exist  and  are  emerging.  It  would  also 
hinder  efforts  to  develop  advanced  tools, 
many  of  which  are  most  readily  developed 
within existing computer environments (e.g.,
Unix  and  Macintosh).  The  motivation  for 
commonality  is  understandable,  and  the 
desire  for  greater  reusability  and 
interoperability  of  software  is  laudable,  but 
the  requirement  for  a  single  language  is 
misplaced.  High  degrees  of  reusability  and 
interoperability  can  be  accomplished  with 
standards  that  are  language  independent. 
Indeed,  that  is  what  makes  “open  architec¬ 
tures”  feasible  and  important.  Ada  is  a 
powerful  language  that  can  greatly  contrib¬ 
ute  to  the  management  and  control  of  soft¬ 
ware  development  in  many  projects,  but  it 
is  much  less  suitable  for  prototyping,  or  for 
models  that  will  continue  to  change  and  that 
deal  with  highly  uncertain  phenomena.  For 
such  models  there  is  a  high  premium  on, 
e.g.,  interactiveness,  flexibility,  clarity, 
explanation  capabilities,  and  easy  connectiv¬ 
ity  to  commercial  tools. 

ENDNOTES


1. Adapted from a RAND report of the same name, R-4249-ACQ, Santa Monica, CA, 1992.

2. Standard references of this sort include Brewer and Shubik (1972), U.S. GAO (1980), and U.S. GAO
(1987), which contains an extensive bibliography. One of the most famous essays on the subject is Stockfisch
(1973). Davis and Blumenthal (1991) examines broader issues and argues that many problems in combat modeling
stem  from  failure  of  the  military  community  to  think  in  terms  of  nurturing  a  robust  military  science.  The  near¬ 
exclusive  emphasis  on  models  as  mere  tools  has  been  an  obstacle  to  seeing  some  models  as  theories  that  need  to 
be  developed,  tested,  and  evolved  scientifically. 

3.  Another  useful  reference  is  Williams  and  Sikora  (1991),  which  provides  a  snapshot  view  of  continuing  work 
on  VV&A  by  the  Military  Operations  Research  Society  (MORS).  Readers  may  wish  to  check  for  updates  in  the 
newsletter Phalanx. MORS hopes to publish a book on VV&A sometime in 1993. This study may contribute to
that  effort. 

4.  Some  sources  define  “simulation”  differently— as  the  operation  or  exercise  of  a  model,  or  as  a  method  of 
implementation. 

5. As an example, consider a model predicting the damage expectancy for a set of hard targets as a function
of  a  bomber's  availability,  reliability,  pre-launch  survivability,  penetration  probability,  bomb  load,  and  hard-target 
kill  capability.  The  bare  model  provides  an  intellectual  framework,  but  has  little  or  no  predictive  value;  its 
predictions  are  “data  driven.”  Similarly,  in  idealized  knowledge-based  systems  such  as  an  expert  system  describing 
likely  decisions  of  a  commander,  the  bare  model  may  be  a  general  “inference  engine”  for  processing  rules,  while 
the  content  of  the  model  resides  entirely  in  the  “knowledge  base”  of  rules  such  as  “If  we  can  achieve  surprise  and 
if the force ratio is no worse than... Then we shall...”.

6.  Another  way  in  which  the  classical  distinction  between  model  and  data  has  broken  down  is  with  the 
introduction  of  highly  interactive  computer  languages,  which  make  it  possible  for  users  to  change  many  equations 
and  structures  in  the  computer  code  as  easily  as  they  can  change  the  data  value  used  for  the  gravitational  constant. 
The most familiar example of this is in spreadsheet programs, but other examples include BASIC and RAND-ABEL.

7. A related issue here is establishing that the numerical procedures used are not introducing chaos effects.
See Palmore (1992).

8. Articles on software engineering sometimes use terms such as “rigorous audit” or otherwise convey the
impression of verification requiring complete testing over all computational “paths.” Except at the level of relatively
small modules, however, such review and testing is usually not feasible. Thus, there is a premium on designing
a doable set of tests that will be likely to uncover the most serious problems.

9.  Some  of  the  following  discussion  draws  on  review  comments  by  Mr.  Dennis  Shea  of  the  Center  for  Naval 
Analyses.  See  also  Pace  and  Shea  (1992). 

10. As discussed later in the study, there are a number of modern techniques that can automate or otherwise
assist  a  good  deal  of  verification  testing.  Many  depend  on  the  existence  of  a  data  dictionary  that  is  part  of  the 
language  or  environment,  not  a  mere  repository  of  comments. 

11.  As  of  April  1,  1992,  the  MORS  group  concerned  with  VV&A  was  using  as  a  working  definition:  “The 
process  of  determining  the  degree  to  which  a  model  is  an  accurate  representation  of  the  real  world  from  the 
perspective  of  the  intended  uses  of  the  model.” 
12. Other workers sometimes refer to structural validity vs output validity. In that breakdown, output validity
includes  both  descriptive  and  predictive  validity. 

13.  The  subject  of  resolution  is  complex  and  analysts  often  need  to  work  with  models  with  different  resolutions 
which,  ideally,  are  consistent  in  the  aggregate.  See  Davis  and  Huber  (1992)  for  related  discussion. 

14.  One  effort  to  assess  descriptive  validity  is  described  in  Bonder  (1984),  which  examines  the  ability  of  the 
Vector-2  model  to  reproduce  the  battle  for  the  Golan  Heights  in  the  Yom  Kippur  War. 

15. Other decompositions are possible. Based on discussions at the MORS SIMVAL II meeting, Dr. Dale
Henderson  of  Los  Alamos  National  Laboratory  decomposes  the  space  of  validation  activities  into  five  dimensions: 
(a)  the  techniques  used  (e.g.,  Delphi  vs  quantitative  comparisons),  (b)  the  basis  of  truth  used  (e.g.,  historical  data 
vs.  results  of  more  detailed  simulations),  (c)  the  applications  intended  for  the  model,  (d)  the  degree  of  composition 
at  which  testing  occurs  (e.g.,  on  primitive  modules  vs  higher-level  subsystems  or  a  complete  integrated  system), 
and  (e)  the  depth  of  the  validation  effort  (e.g.,  surface-level  or  face-validity  testing).  The  principal  point  is  that 
validation activities are multidimensional rather than rank-ordered or hierarchical.

16.  Prehistoric  man  presumably  “knew"  that  the  sun  would  come  up  every  morning  and  that  there  was  a  cycle 
of  progressively  longer  and  then  progressively  shorter  days.  He  presumably  counted  on  this  model  long  before  there 
was  any  understanding  of  astronomy. 

17.  Most  of  our  “stochastic  processes"  are  at  their  root  deterministic;  the  problem  is  our  uncertainty  about 
initial  values  and  interactions  with  other  processes,  which  causes  us  to  treat  them  as  stochastic. 

18.  In  practice,  application-specific  accreditation  usually  depends  (and  should  depend)  on  an  assessment  of  the 
people  and  organization  using  the  model,  not  merely  the  model  itself.  Indeed,  one  can  argue  that  it  is  more 
important  to  “accredit"  (or  at  least  to  assess)  people  and  organizations  than  the  tools  they  use. 

19.  A  classic  example  of  this  is  use  of  silo  hardness,  measured  in  psi.  Many  strategic-nuclear  analyses  have 
been  conducted  using  silo  hardness,  even  though  the  phenomenology  of  silo  destruction  is  complex  and  requires 
something more sophisticated, such as a vulnerability number approach that accounts for effects of both static and
dynamic  pressures.  Analysts  can  nonetheless  get  by  with  computer  programs  or  analytic  models  using  hardness, 
because they do offline calculations to derive the effective hardness of silos to the weapon yields of interest.

20.  One  can  argue  that  the  issue  of  clarity  applies  more  to  the  study  or  other  application  than  to  the  model 
itself, but those interested in the clarity (and reproducibility) of studies are usually driven toward seeking clarity
of  models  as  well.  While  it  is  true  in  principle  that  analysis  with  black-box  models  can  be  clear,  given  enough 
sensitivity  testing,  my  own  experience  is  that  depending  on  such  an  approach  is  usually  a  recipe  for  disaster. 

21.  As  an  example  here,  if  one  knows  the  detailed  application,  one  can  develop  tests  of  the  integrated  system 
using  relevant  parameter  values.  Without  such  knowledge,  full-system  testing  may  be  extremely  difficult  because 
of the number of possible combinations.

22.  See  Zuhtu  and  Oren  (1986),  Sargent  (1986),  and  Oren  (1986)  for  discussion  of  ambitious  ideas  going 
beyond  the  examples  given  here. 

23. Most of these techniques require an “active data dictionary,” which is a data base of information on the
model’s  data— e.g.,  information  on  type,  format,  acceptable  values  and  meaning.  Except  for  “meaning,”  the  infor¬ 
mation  can  be  used  automatically  to  check  source  code  and  data  values. 
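
A minimal Python sketch of how such a dictionary might be used to check data values automatically is given below. The variable names, ranges, and the function name check_inputs are hypothetical, and the sketch checks only type and acceptable values; as the note says, "meaning" is recorded for human readers and is not checked.

```python
# Hypothetical active data dictionary: type, acceptable values, and meaning
# for each model variable. All entries here are illustrative assumptions.
DATA_DICTIONARY = {
    "force_ratio": {"type": float, "min": 0.0, "max": None,
                    "meaning": "attacker-to-defender strength ratio"},
    "time_step_hours": {"type": float, "min": 0.25, "max": 24.0,
                        "meaning": "simulation time step"},
    "posture": {"type": str, "allowed": {"attack", "defend", "withdraw"},
                "meaning": "current posture of the force"},
}


def check_inputs(inputs):
    """Return a list of problems found when checking inputs against the dictionary."""
    problems = []
    for name, value in inputs.items():
        entry = DATA_DICTIONARY.get(name)
        if entry is None:
            problems.append(f"{name}: not defined in the data dictionary")
            continue
        if not isinstance(value, entry["type"]):
            problems.append(f"{name}: expected {entry['type'].__name__}")
            continue
        if "allowed" in entry and value not in entry["allowed"]:
            problems.append(f"{name}: {value!r} is not an allowed value")
        low, high = entry.get("min"), entry.get("max")
        if low is not None and value < low:
            problems.append(f"{name}: {value!r} is below the minimum {low!r}")
        if high is not None and value > high:
            problems.append(f"{name}: {value!r} is above the maximum {high!r}")
    return problems


for problem in check_inputs({"time_step_hours": 48.0, "posture": "charge"}):
    print(problem)
```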
24. Hodges and Dewar (1992) argue that failure to appreciate this reality has been a fundamental source of
difficulty in the continuing discussions about validating military models. They argue that the word “validation”
should be reserved for predictive models that can be rigorously tested, and that other types of model evaluation
should  be  developed  as  a  function  of  how  the  models  are  to  be  used  (e.g.,  as  bookkeeping  devices  in  a  human  war 
game,  as  decision  aids,  and  as  devices  to  stimulate  hypotheses). 

25.  It  is  not  uncommon  for  “theories”  to  be  expressed  in  ways  that  make  it  impossible  to  disprove  them. 
Good  science,  by  contrast,  insists  that  theories  be  falsifiable.  Indeed,  scientists  go  to  considerable  lengths  to  define 
experiments  that  stress  their  theories  as  much  as  possible. 

26.  As  an  example  of  where  military  science  might  enter,  consider  the  many  theater-level  models  over  the  years 
in  which  air  forces  for  close  air  support  and  battlefield  interdiction  have  not  been  concentrated  in  time  and  space, 
thereby  diluting  their  potential  effect  on  the  other  side’s  ground-force  maneuver  and  ignoring  the  importance  of 
concentration and coordination to military art generally and to survival and effectiveness of those air forces
specifically.  As  another  example,  consider  the  common  failure  to  represent  adequately  the  suppressive  effects  of 
artillery.  There  are  models,  of  course,  which  handle  both  of  these  issues  relatively  well,  but  many  military  models 
have grossly misrepresented the phenomena, often without justifying their simplifications through auxiliary
calculations.  Detecting  such  problems  is  arguably  a  matter  of  “science,”  not  logic  or  analytic  rigor. 

27.  It  is  striking  to  note  that  theoretical  evaluation  is  commonly  (almost  always)  omitted  from  discussion  of 
validation  methods.  It  is  most  assuredly  not  the  same  as  “logical  verification”  or  “logical  testing.”  My  own  sense 
is that the omission is another symptom of military modeling suffering from not being part of a military science.
It  has  perhaps  been  overly  influenced  by  mathematicians  and  programmers,  without  the  emphasis  on  phenomenology 
that  scientists  are  supposed  to  bring  to  the  table  (but  scientists  can  also  be  beguiled  by  simplistic  but  elegant 
mathematics).  An  important  role  for  military  officers,  including  retired  general  officers  serving  as  consultants,  is 
to insist that modelers pay more attention to the real phenomena. They must demand more military science if the
models  are  to  be  faithful  to  their  needs. 

28. In MORS work the distinction has been drawn between “output validation” and “structural validation.”
One can map the activities of Fig. 3.2 into these terms, but not neatly. Theoretical evaluation includes both
structural validation and testing behavior (outputs) in various special cases that are understood with prior theories or
for  which  there  exist  solid  empirical  data.  Empirical  evaluation  in  Fig.  3.2  relates  to  output  validation  in  MORS 
terms.  “Other  comparisons”  in  Fig.  3.2  involve  both  structural  and  output  validation.  For  example,  comparisons 
to  expert  opinion  and  doctrine  can  look  both  at  assumptions  and  output. 

29.  This  view  treats  validation  as  a  matter  of  degree.  Hodges  and  Dewar  (1992)  take  a  different  approach. 

30.  As  one  reviewer  of  this  report  noted,  “doing  something”  sometimes  should  mean  doing  the  best  analysis 
possible  even  though  that  means  not  using  a  computer  model  that  sponsors  and  users  of  the  computer  model  are 
expecting  will  be  used.  This  may  be  logically  obvious,  but  it  can  be  a  problem  in  practice  because  there  are 
instances  in  which  reference  to  a  well  known  computer  model  is  thought  somehow  to  confer  a  sense  of  validity, 
legitimacy,  or  acceptability. 

31.  MacQuie  (1987)  is  an  interesting  compilation  of  historical  data  to  be  used  in  tests  of  face  validity.  The 
Army's Concepts Analysis Agency has a continuing effort to exploit historical data (see Helmbold, 1990, for
references).

32. Even more fundamental is the need for professional model development practices emphasizing
module-by-module testing by developers as a routine part of everyday work. If sloppier methods have been followed,
face-validity efforts are likely either to fail or to be quite misleading.

33. Importantly, much more extensive testing would be possible if it were budgeted. It is unusual, however,
for military simulation projects to set aside, e.g., 20% of the overall project funds for independent and
comprehensive VV&A. In some instances, such testing would be well worth the investment. In many other cases,
however,  some  imperfections  are  quite  tolerable. 

34.  Some  concrete  examples  here  come  from  a  recent  evaluation  by  the  Center  for  Naval  Analyses  (CNA)  of 
a command and control model. The review asked: (a) Have all the decision nodes been identified?; (b) For each
node,  has  a  variable  been  defined  for  each  factor  that  could  affect  decisions  at  that  node?;  and  (c)  For  every  possible 
state  of  each  variable  at  each  node,  has  a  rule  been  developed  (e.g.,  an  If/Then  statement)  and  does  the  rule  reflect 
the  judgement  of  experts? 
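
The three review questions amount to a completeness check on a rule table, which can be partly automated. The following is only a minimal sketch of question (c), not the CNA procedure: the function name, the factor names, and the rules are all hypothetical, and the check confirms only that a rule exists for each state combination, not that it reflects expert judgement.

```python
from itertools import product

def uncovered_states(factors, rules):
    """Return every combination of factor states that has no rule.

    factors: dict mapping factor name -> list of possible states
    rules:   dict mapping a tuple of states (in factor order) -> action
    """
    names = list(factors)
    all_states = product(*(factors[n] for n in names))
    return [combo for combo in all_states if combo not in rules]

# Hypothetical decision node with two factors (names are illustrative only).
factors = {
    "threat_level": ["low", "high"],
    "comms_status": ["up", "down"],
}
rules = {
    ("low", "up"): "continue_mission",
    ("low", "down"): "continue_mission",
    ("high", "up"): "request_support",
    # ("high", "down") is deliberately omitted to show a gap.
}

missing = uncovered_states(factors, rules)
print(missing)  # [('high', 'down')]
```

A check of this kind addresses only coverage; whether each rule is sensible still requires review by experts.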

35.  My  own  experience  with  knowledge-based  models  has  emphasized  theory  and  logic,  with  experts  being  used 
mostly for spot checking. See, e.g., Davis, Bankes, and Kahan (1986). The textbook concept of using “knowledge
engineers”  to  extract  knowledge  from  experts  often  does  not  apply  or  is  less  efficient  and  organized  than  having 
a subject-area analyst build a model and then iterate it by talking with experts. For a discussion of the
knowledge-engineering approach, see Waterman (1986).

36.  This  discussion  envisions  a  model  being  used  for  an  analysis  study.  However,  analogous  diagrams  could 
readily  be  constructed  for  such  other  applications  as  training,  education,  and  operational  decision  aids.  Some 
readers  may  wish  to  do  so. 

37.  There  is  an  issue  of  balance  and  complementarity  here.  Some  discussions  of  VV&A  convey  the  impression 
that  models  can  be  adequately  evaluated  once  and  for  all,  when  in  reality  model  appropriateness  must  be  judged  in 
the  context  of  an  application.  However,  studies  often  occur  with  time  pressures  and  modest  resources,  which  means 
that  they  cannot  take  on  the  full  burden  of  evaluating  models  from  scratch  and  depend  on  there  having  been  a 
considerable  degree  of  prior  VV&A.  While  Fig.  4.1  deliberately  focuses  on  VV&A  for  an  application,  both  that 
and  the  broader  VV&A  are  increasingly  considered  essential  (e.g.,  US  Army,  1992).  Personally,  I  would  argue 
that generic V&V is essential, and generic accreditation is potentially useful (and potentially troublesome),
depending  on  organizational  sophistication,  integrity,  and  efficiency. 

38. In a similar spirit, a colleague and I conducted a study of possible post-crisis defense requirements a few
months  before  the  allied  offensive  against  Saddam  Hussein,  in  which  we  used  an  extremely  simple  spreadsheet 
model  using  Lanchester  equations  and  aggregated  force  strengths  for  ground  combat.  The  reason  for  doing  so  was 
that we observed that results of more sophisticated and complex war gaming analysis were driven by a few factors (e.g.,
air-to-ground  effectiveness)  that  were  being  obscured  by  the  original  level  of  detail  (see  Shlapak  and  Davis,  1992). 
For  other  purposes,  however  (e.g.,  evaluating  offensive  capabilities),  the  simple  model  would  have  been  ludicrously 
inappropriate. 
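
To give a sense of how small such a model can be, here is a minimal sketch of aggregated ground combat under Lanchester equations. It is not the spreadsheet used in the study: the square-law form, the function names, the force strengths, and the effectiveness coefficients are all assumptions chosen for illustration.

```python
import math

def lanchester_square(blue, red, blue_eff, red_eff, dt=0.01, max_steps=1_000_000):
    """Euler-integrate dB/dt = -red_eff*R and dR/dt = -blue_eff*B
    until one side is reduced to zero (or max_steps is reached)."""
    for _ in range(max_steps):
        if blue <= 0 or red <= 0:
            break
        # Simultaneous update: both right-hand sides use the old values.
        blue, red = blue - red_eff * red * dt, red - blue_eff * blue * dt
    return max(blue, 0.0), max(red, 0.0)

def analytic_survivors(blue, red, blue_eff, red_eff):
    """Closed-form survivors from the square-law invariant
    blue_eff*B^2 - red_eff*R^2 = constant."""
    margin = blue_eff * blue**2 - red_eff * red**2
    if margin >= 0:
        return math.sqrt(margin / blue_eff), 0.0
    return 0.0, math.sqrt(-margin / red_eff)

# Hypothetical aggregated strengths and per-unit effectiveness values.
sim_blue, sim_red = lanchester_square(1000.0, 800.0, 0.05, 0.05)
exact_blue, exact_red = analytic_survivors(1000.0, 800.0, 0.05, 0.05)
print(sim_blue, exact_blue)
```

Comparing the numerical integration with the closed-form result is itself a small instance of testing a model against a special case that is understood from prior theory.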

39.  Although  not  discussed  in  this  study,  a  major  issue  is  how  the  DoD  can  create  positive  incentives  for 
VV&A. Currently, most of the “incentives” under discussion are in the nature of requirements and threats. The
most obvious incentive, however, is money; by budgeting appropriately for serious VV&A, the DoD would quickly
find  itself  receiving  first-rate  proposals  for  high-quality  testing.  The  second  principal  incentive  I  see  is  the  fostering 
of  an  invigorated  military  science  as  discussed  in  Davis  and  Blumenthal  (1991). 

40.  There  is  a  strongly  held  view  in  the  larger  software  community  that  good  VV&A  is  necessarily  independent 
VV&A.  Indeed,  it  is  not  uncommon  to  have  separate  organizations  charged  with  development  and  VV&A.  The 
motivation  here  is  recognizing  that  developers  often  have  profound  conflicts  of  interest  that  undercut  VV&A.  The 
pressures  include  deadlines,  cost,  the  desire  to  include  new  and  more  sophisticated  submodels,  and  the  antipathy 
of  workers  for  the  drudgery  of  extensive  testing.  An  independent  tester  paid  specifically  to  certify  software  has, 
by  contrast,  other  incentives.  At  the  same  time,  there  is  substantial  evidence  demonstrating  that  “independent 
testing”  cannot  usually  be  conducted  in  isolation:  it  is  essential  for  the  testers  to  interact  with  both  developers  and 
users. Developing appropriate working relationships that balance independence of judgment with cooperation and
exchange  of  information  is  therefore  important. 

41.  The  issue  of  budgeting  for  VV&A  is  fundamental,  and  the  failure  to  appreciate  this  probably  underlies 
many of the VV&A problems in the military modeling community.

42.  As  one  example,  consider  that  program  planners  often  think  in  terms  of  aggregations  that  are  of  little  or 
no  value  to  officers  participating  in  operational  exercises.  As  a  result,  they  need  different  models.  Ideally,  the 
models  will  be  consistent,  but  that  is  not  always  easy  (Davis  and  Huber,  1992). 

43. See, e.g., Zeigler (1984), Sargent (1986, 1987), Gass (1982), and Martin Marietta (1990).

44.  As  discussed  by  Julian  Palmore  of  the  University  of  Illinois  in  an  address  to  the  60th  MORS  conference 
in Monterey, California in June, 1992, even details of computer arithmetic can be very important in simulation.
Failure to pay attention to such details can produce substantial “structural variance” as manifested, e.g., by peculiar
sensitivity  results  and  major  changes  in  results  if  one  shifts  from  one  computer  to  another.  See  also  Palmore  (1992). 
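
The kind of arithmetic detail at issue can be seen in a few lines. The sketch below is not drawn from Palmore's address; it is a generic illustration that assumes IEEE 754 double-precision floating point, and all variable names are illustrative.

```python
import math

# Floating-point addition is not associative: grouping changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# A small term can vanish entirely when added to a large running total.
big_first = (1e16 + 1.0) - 1e16
big_last = (1e16 - 1e16) + 1.0

# A simulation clock advanced in steps of 0.1 drifts from the exact time.
clock = 0.0
for _ in range(10):
    clock += 0.1
clock_is_exact = (clock == 1.0)

# Accurately rounded summation avoids the drift in this case.
compensated = math.fsum([0.1] * 10)

print(order_matters, big_first, big_last, clock_is_exact, compensated)
```

If a compiler, library, or machine evaluates such expressions in a different order or precision, a model that branches on the results (for example, on whether the clock has reached a scheduled event time) can behave differently from one computer to another.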

45.  Separate  documentation  is  still  needed  for  gaining  a  top-down  overview  of  the  model  and  program. 
Further,  it  is  virtually  essential  when  the  program  itself  is  large.  However,  the  documentation  may  be  out  of  date 
or  may  contain  errors  that  do  not  exist  in  the  code  (and,  of  course,  the  code  may  contain  errors  not  in  the 
documentation). My own view is that future reviews of models should ideally combine reading of documentation
for top-down structure and having that documentation, which may also be on line, “point to” critical portions of
code that can be examined directly. That will be increasingly feasible with high-level computer languages and
environments  (see  Annex  B). 

46.  One  can  design  a  model  in  terms  of  objects,  attributes,  processes,  and  the  like  whether  or  not  the 
programming  language  has  the  paraphernalia  of  objects,  messages,  methods,  and  so  on. 

47. In naval modeling a special need is discussion of how environment is handled in the model.

48. Anabel, the result of ideas by Edward Hall and Norman Shapiro, is being developed as part of a grander
scheme  for  a  modeling  and  analysis  environment  (see  Anderson,  Bankes,  Davis,  Hall,  and  Shapiro,  forthcoming). 
RAND-ABEL  is  documented  in  Davis  (1990)  and  Shapiro,  Hall,  Anderson,  LaCasse,  Gillogly,  and  Weissler  (1988). 